
The embedded long-term support initiative

By Jonathan Corbet
October 29, 2011
The Linux Foundation's Consumer Electronics Working Group (the group known as the Consumer Electronics Linux Forum before the two organizations merged) chose the Embedded Linux Conference Europe as the forum for the announcement of a new mechanism for the maintenance of stable kernels for the embedded industry. If all goes well, this "long-term support initiative" (LTSI) will provide better kernels for embedded systems with less effort while increasing the amount of code that the embedded industry sends upstream.

The initiative was presented by Tsugikazu Shibata, a manager at NEC and a member of the Linux Foundation's board. According to Shibata-san, current long-term supported kernels do not really meet the embedded industry's needs. Those kernels are well suited to the needs of the enterprise distributions, where the same kernel can be used and supported for periods up to ten years. A number of those distributions are built on 2.6.32, which can expect to be supported for some time yet. Enterprise distributions last for a long time, so the kernels they use are picked rarely and supported for many years.

In the embedded world, 2.6.32 is very old news. Product lifetimes are much shorter for most embedded products; manufacturers in this area need a new, supported kernel every year. This industry, however, has no infrastructure for the support of kernels on that sort of schedule. Last year the industry came together and decided to standardize on 2.6.35, but it was a one-shot action; no plans were made to support any later kernel versions. That is a problem: products being designed now are likely to need something newer than 2.6.35.

Another problem is that finding common ground for a standard embedded kernel is hard. Much of the industry is currently driven by Android, which puts out releases with new kernels every six months or so. Manufacturers, though, tend to get their kernels from their suppliers; those kernels can be based on almost any version and lack any support going forward. Those manufacturers need that support. They also would really like to use the same kernel for a few generations of a product; that requires support for a period of a few years, but it also requires a certain amount of backporting of drivers and features.

Yet another problem, one that has characterized the embedded industry for years, is that there still are not enough contributions to the mainline from embedded developers (though, in all fairness, things have improved considerably). Manufacturers have a lot of patches in house, many of which do good stuff, but they are not going upstream. That imposes costs on those manufacturers, who have to carry those patches forward, and it impoverishes the mainline.

The LTSI project will have three components to address these problems. The first of those will be a long-term stable tree for the embedded industry. A new tree will be selected roughly once each year; 3.0 has been picked as the initial kernel in this series. Each long-term kernel will receive updates for two years as part of the usual stable update process; as with the other stable kernels, only bug fixes and security updates will be considered. Some sort of advisory board will be set up with a number of industry developers to pick subsequent stable kernels.

The second component will be the "LTS industry tree," which will be maintained by CE Working Group members independently of the regular stable updates. This is the tree that, it is expected, will actually be used in products. It will be based on the long-term releases, but will include some extras: backported drivers, perhaps some backported features, and various other vendor patches. In addition, there will be an associated "staging tree" where interesting code can be tested out prior to its inclusion in the industry tree. A separate quality assurance group will devote itself to testing the code in the staging tree and deciding when it can graduate.

The normal path for getting code into the industry tree will be to get it upstream; a backport can then be done. There will, however, be a mechanism for taking code in directly via the LTSI staging tree in unusual or urgent cases. Shibata-san was clear, though, that "upstream first" will be the usual rule for this tree.

Finally, there will be an initiative to help industry engineers get their code upstream. Yet another staging tree will hold these patches while they are made suitable for inclusion into the mainline.

While it is assumed that the embedded industry is carrying a lot of code internally, nobody has ever really known for sure. To get a better handle on how much out-of-tree code exists, the Working Group launched the "Yami-nabe Project." Yami-nabe parties are, evidently, an old, no longer observed Japanese custom. Everybody would show up to a dark room containing a large pot of water; each would bring some item of food and toss it in. Everybody would then eat the resulting soup without knowing what was in it or where it came from - whether they liked it or not.

The modern form of a Yami-nabe party, it seems, involves collecting out-of-tree patches from manufacturers without tracking where they came from. None of the companies involved want to have their work compared to that of others. The collected code was examined to see how much of the kernel had been modified, and how much duplicated code there was. Turns out there were a lot of files in the kernel that most or all manufacturers felt the need to modify; much was in architecture-specific and driver code, but there were a lot of core kernel code changes too. Quite a few vendors had made similar changes in the same places.

So clearly there is some duplication of effort going on. The LTSI tree, Shibata-san said, should help to reduce this duplicated effort and to reduce the kernel fragmentation in the embedded industry. Throw in some help with the upstreaming of code and, they hope, fragmentation in this area can be eliminated entirely - or at least something close to that.

The first kernel will be 3.0-based, which is, not coincidentally, the base for the Android "ice cream sandwich" kernel. They hope to have an industry kernel release available sometime in the first half of 2012. After that, it's a matter of seeing what kind of uptake there is in the embedded industry. It seems, though, that quite a few companies are interested in this project, so the chances that they will at least look at its output are pretty good.

[Thanks to the Linux Foundation for supporting your editor's travel to Prague to attend this event]



The embedded long-term support initiative

Posted Oct 29, 2011 10:21 UTC (Sat) by kragilkragil2 (guest, #76172) [Link]

Nice start and definitely an improvement. But shouldn't the long term goal be way longer support?

I don't want my TV, NAS, router, etc. to be only secure for 2 years. Just because not a lot of bad things have happened so far doesn't mean they won't in the future. A world stuffed with vulnerable Linux devices that soon possess a lot of computational power and storage is a big target for spammers, crackers and spies. Especially with XP dying and Windows getting more secure.

In my perfect world embedded devices would standardize more so that some kind of automated update system would work and each device would pull updates automatically (with enough checks and backups so that very rarely something breaks).

It is not easy, but definitely possible. But I guess consumers would need awareness to vote with their money units to really make it happen ;-(
I would love to see awareness about community supported devices grow. There are a few communities that provide easy updates. Most of them are way too hard for your average guy; that would need to change, but OEMs could help there too. If they don't do more they are at risk that someone like Apple might come in and just improve on the current state of things a bit, but offer better support with new features for years. All the OEMs need to understand that they are selling products that are mostly defined by software now.

The embedded long-term support initiative

Posted Oct 29, 2011 11:02 UTC (Sat) by cortana (subscriber, #24596) [Link]

A perverse problem is that customers *hate* updating devices. Everyone I know who owns a Playstation 3 complains when Sony push a mandatory update down their throat. They just want to play a game, or watch a Blu-ray, not have to wait 30 minutes for an update to be downloaded and installed, together with the presentation of a new multi-page EULA displayed in a tiny unreadable font. And Sony probably have the best, most streamlined update process of any embedded device vendor! People like me are stuck with an old, known buggy version of Android because their mobile phone model was discontinued a year ago and their phone company don't have any incentive to push out a software update.

IMO, updates have to be re-thought completely. The ideal system will update automatically when the device is not in use; will perform updates silently, in the background, without imposing any interruption upon the end-user (and yet, the updates have to be 'in effect' immediately; e.g., quitting and restarting the updated movie playing application in the device without dropping a frame or interrupting the audio); will never cause any regressions; will not be spoofable by a nefarious third party that wants to take over my TV in order to send spam (or display fake news stories); and will not be deferrable, or even disable-able, without disconnecting the system from the network entirely. A tall order!
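
The "not spoofable" item on that wish list usually comes down to signature verification: the device refuses to apply any image it cannot verify against a public key shipped in read-only storage. Below is a minimal C sketch of that check, assuming libsodium's Ed25519 detached signatures; the file names, the key handling and the overall layout are illustrative only, not any vendor's actual scheme.

    /* Sketch: refuse to apply an update image unless its detached
     * Ed25519 signature verifies against a key baked into the firmware.
     * Build with: cc verify.c -lsodium
     * File names and layout are hypothetical.
     */
    #include <sodium.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Vendor public key, compiled into the (read-only) firmware image. */
    static const unsigned char vendor_pk[crypto_sign_PUBLICKEYBYTES] = {
        0 /* 32 bytes of real key material would go here */
    };

    /* Read a whole file into memory; returns NULL on any failure. */
    static unsigned char *slurp(const char *path, size_t *len)
    {
        FILE *f = fopen(path, "rb");
        unsigned char *buf = NULL;
        long sz;

        if (!f)
            return NULL;
        fseek(f, 0, SEEK_END);
        sz = ftell(f);
        rewind(f);
        buf = malloc(sz);
        if (buf && fread(buf, 1, sz, f) != (size_t)sz) {
            free(buf);
            buf = NULL;
        }
        fclose(f);
        *len = (size_t)sz;
        return buf;
    }

    int main(void)
    {
        size_t img_len = 0, sig_len = 0;
        unsigned char *img, *sig;

        if (sodium_init() < 0)
            return 1;

        img = slurp("update.img", &img_len);
        sig = slurp("update.img.sig", &sig_len);
        if (!img || !sig || sig_len != crypto_sign_BYTES) {
            fprintf(stderr, "missing or malformed update\n");
            return 1;
        }

        /* Returns 0 only if the signature matches the image and the key. */
        if (crypto_sign_verify_detached(sig, img, img_len, vendor_pk) != 0) {
            fprintf(stderr, "bad signature, refusing to apply update\n");
            return 1;
        }

        puts("signature OK, handing the image to the installer");
        /* ... write the image to the inactive slot here ... */
        return 0;
    }

The rest of the wish list (silent background installation, instant activation, no regressions) is the genuinely hard part; the cryptography is comparatively cheap.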

The embedded long-term support initiative

Posted Oct 29, 2011 11:40 UTC (Sat) by kragilkragil2 (guest, #76172) [Link]

They don't hate updating their Chrome browsers, because they hardly ever notice. It is all about the way you do it.

And I have to strongly disagree with Sony being a good example.
Their updates are way too big, dumb and really are mostly made of fail:
http://arstechnica.com/gaming/news/2011/09/resistance-3-h...
http://arstechnica.com/gaming/news/2010/08/dear-sony-ther...
They should provide delta updates that download in the background and are fast to install, none of which they do.

MS updates Xbox games way faster and most of their OS/firmware updates for Xbox are tiny and fast, only twice a year there is a big update.

The embedded long-term support initiative

Posted Oct 29, 2011 16:19 UTC (Sat) by fuhchee (subscriber, #40059) [Link]

"Everyone I know who owns a Playstation 3 complains when Sony push a mandatory update down their throat."

That's partly because Sony is known to regularly abuse their customers this way.

The embedded long-term support initiative

Posted Oct 29, 2011 16:23 UTC (Sat) by cortana (subscriber, #24596) [Link]

Ugh. My point is that at least Sony bother to push out updates--unlike HTC or Samsung, who don't bother (not singling them out--I just happen to own devices made by both that I know are based on Linux, and have security vulnerabilities, and that don't have any updates available). And yet, the average user may _prefer_ the vulnerable device because it's less hassle to use--no pesky updates interrupting them!

The embedded long-term support initiative

Posted Nov 2, 2011 15:05 UTC (Wed) by jmm82 (guest, #59425) [Link]

People do not like the PS3 updates because they often lock down the system more and take away liberties they once had.

People do not like updating the firmware on their router because everyone has updated a router that worked perfectly fine only to either a. brick the device or b. have some feature that used to work suddenly intermittently fail.

Updating a kernel on a stable system is risky business. The people here should know that more than anyone, yet if security is a priority then it must be done.

The embedded long-term support initiative

Posted Nov 2, 2011 16:10 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

If the updates add additional features and seldom fail then customers generally like them

It's when the updates either don't add any new features, or worse, remove existing features (even if they add other new features), then users complain

This isn't limited to firmware/kernel updates.

The KDE and GNOME '.0' updates are perfect examples of updates that remove some things that existing users notice and therefore people are unhappy with them, even though the developers add a lot of new features as well.

The embedded long-term support initiative

Posted Oct 30, 2011 7:33 UTC (Sun) by jcm (subscriber, #18262) [Link]

I hate updating devices but for another reason: realistic pragmatism.

While it's fun to apply random software updates when you're hacking on some gadget, I personally like my technology to work when I need it to. Updates that fix one liner security bugs are useful to have and I am glad they are made available. But updates that introduce a whole new kernel version or make other non-critical updates (perhaps even adding new features) do me a disservice as a consumer. When these changes are made, they introduce the risk that the update breaks something I am relying on for production use. Most non-hackers also realize this reality of life. They know that computers are not perfect, updates often seem to introduce problems, and things are working "fine" today so they don't need "fixing". The real solution is to provide rock solid security and other critical fixes only so that consumers will feel safe in applying updates in the future.

"Oh but Jon, you're just...". Yea. I am. For the same reason I own three Android phones and deploy all Google supplied OTA updates to one test phone before allowing my production phone to update, or stage other updates before allowing them near production machines. The last OTA update to my Android phone running stock Nexus S software was supposed to contain a fix to the previously issued OTA update that I had not yet received and which would have broken tethering. Alas, this OTA update with the "fix" also breaks tethering on my staging one. By having an interim staging testbed I am able to avoid a crticial feature breaking for me because I didn't allow it near my phone. At least the phone gave me an option of deferring the update. Had it taken the "we know better" approach of updating automatically without any choice, I would have no useful way to connect on the go at this point. So, automatic updates without a choice to skip them are bad, updates that provide other than critical fixes are unwanted, and long term support should focus on what I actually want: just the bare minimal set of fixes until I choose to go to a newer product or revision of the software for that product.

Jon.

The embedded long-term support initiative

Posted Oct 30, 2011 11:34 UTC (Sun) by nix (subscriber, #2304) [Link]

Quite so. There's another thing: reversion. On your local machine you can revert failed upgrades or upgrades you just don't like easily, but mobile devices rarely give you that freedom. Also, because of the absence of anything like a boot menu, if the update is completely nonfunctional you have a brick.

So they are less like PC kernel upgrades than like BIOS upgrades. Quick, who here keeps their BIOS religiously up to date? I think I've upgraded mine once, when I had a serious bug I had to fix, and never again, because every upgrade carries with it the possibility of bricking your machine (and is there a recovery path other than pulling the chip and putting in a new one? Do you *have* a spare? I doubt it.)

Now thanks to ACPI and SMM, BIOSes can have security holes in them -- in fact given their general code quality I suspect they are a mass of security holes packed edge-to-edge with no space between. But *even so* I don't upgrade unless I must, and upgrading fills me with trepidation. Embedded and mobile devices are just like that, except that if the upgrade is automatic you don't even get a chance to say no (or in the case of games consoles 'hey, I pay for my bandwidth, you bastard!')

The embedded long-term support initiative

Posted Oct 30, 2011 15:37 UTC (Sun) by kragilkragil2 (guest, #76172) [Link]

IMO the suckage of current update mechanisms is no good argument for not having a sane automatic one in the future.

Chrome and ChromeOS seem to do it mostly right, so it's not impossible.

Automatic delta updates on two OS partitions are a sane solution. Of course they should be better tested than what Google provides OTA atm, but in case you miss tethering you have the old OS partition without the updates to go to.
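
The two-partition scheme is simple enough to sketch: write the new image to whichever slot is not currently booted, make sure it has really hit storage, and only then flip a flag that the boot loader consults. Here is a rough C illustration; the device node and the flag-file protocol are made up, and a real implementation would also checksum the written slot and count failed boots before falling back to the old one.

    /* Sketch of an A/B slot update: stream a new image onto the inactive
     * root partition, then atomically switch the slot the boot loader
     * will try next.  Paths and the flag-file protocol are hypothetical.
     */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static int copy_image(const char *image, const char *slot_dev)
    {
        int in = open(image, O_RDONLY);
        int out = open(slot_dev, O_WRONLY);
        char buf[1 << 16];
        ssize_t n;

        if (in < 0 || out < 0)
            return -1;
        while ((n = read(in, buf, sizeof(buf))) > 0)
            if (write(out, buf, n) != n)
                return -1;
        if (n < 0 || fsync(out) != 0)   /* make sure it really hit flash */
            return -1;
        close(in);
        return close(out);
    }

    static int switch_slot(const char *next_slot)
    {
        /* Write the choice to a temporary file, then rename() it over the
         * flag the boot loader reads: the switch is all-or-nothing. */
        FILE *f = fopen("/boot/next-slot.tmp", "w");

        if (!f || fprintf(f, "%s\n", next_slot) < 0 || fclose(f) != 0)
            return -1;
        return rename("/boot/next-slot.tmp", "/boot/next-slot");
    }

    int main(void)
    {
        /* Assume we are currently running from slot A; update slot B. */
        if (copy_image("/var/cache/update.img", "/dev/mmcblk0p3") != 0)
            return 1;   /* nothing changed; the running system is untouched */
        if (switch_slot("B") != 0)
            return 1;
        puts("slot B staged; it will be tried on the next reboot");
        return 0;
    }

If the new slot fails to boot, the boot loader simply falls back to the old one - which is exactly the "old OS partition to go to" described above.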

The embedded long-term support initiative

Posted Oct 30, 2011 21:09 UTC (Sun) by nix (subscriber, #2304) [Link]

Indeed not. I wasn't trying to argue against proper update mechanisms, with reversion and non-insane bandwidth requirements and maybe even *gasp* a changelog you can see as the download happens.

I was just pointing out just how far from that the current state of the art is.

The embedded long-term support initiative

Posted Oct 30, 2011 22:04 UTC (Sun) by jcm (subscriber, #18262) [Link]

The problem with partitions and current rollback is that it doesn't actually work. What actually happens is that your data sitting on a separate partition from the system is automatically converted to whatever newer incompatible format has been shoved into the update with no thought to rollback.

No. I don't trust updates to working production systems. I already have to duplicate everything I use to work around broken design. The fault is with changing what works. If it works, don't break it until the world moves to less hackish approaches to software platform consistency.

Jon.

The embedded long-term support initiative

Posted Oct 31, 2011 19:06 UTC (Mon) by dlang (✭ supporter ✭, #313) [Link]

TiVo does this right. The data is on a separate partition, and they do not change it to the 'flavor of the day' format in the update.

10 years worth of updates and still running

Bwahahah

Posted Oct 31, 2011 7:35 UTC (Mon) by khim (subscriber, #9252) [Link]

I wasn't trying to argue against proper update mechanisms, with reversion and non-insane bandwidth requirements and maybe even *gasp* a changelog you can see as the download happens.

It's not even funny. When geeks start talking about "proper update mechanism" they invariably raise insane requirements which are totally pointless and useless.

Most people encounter an "almost perfect" update mechanism (with almost 100% updated devices!) every day and don't even know that. Where are these devices? How come no one talks about them? Well... why should they? They "just work"(tm) - and that's enough. And people cheerfully keep them up-to-date without any complaints!

Heresy? Fantasy? Nope. I'm talking about your cable TV set-top box. These are quite complex devices nowadays (with a built-in browser, the ability to record and replay TV programs, etc, etc). They are regularly updated (when the TV company decides to sell you new capabilities) - and people rarely complain.

Note: no reversion mechanism, no "sane bandwidth requirements" and no changelogs. Just two things are required:
1. It must be invisible (i.e.: the update is pushed not when it is urgently needed but slow and steady - this way the bandwidth requirements are hidden).
2. "It must work"(tm): if the update goes haywire then it's not something the user should fix somehow; this occasion should be treated the same way a power brick explosion is treated (and it should happen rarely - like power brick explosions).

ChromeOS updates are decent imitations but sadly they are not as robust for now...
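
Point 1 is essentially rate limiting: the box fetches the image slowly enough, in the background, that the owner never notices the traffic. A rough C illustration of the idea follows; the chunk size and the 32 KB/s budget are arbitrary numbers, and a real box would presumably do this inside its download manager.

    /* Sketch: copy an update from 'src' to 'dst' at a capped average
     * rate so the download stays invisible to the user.  'src' could be
     * a socket or a file descriptor; the numbers are arbitrary examples.
     */
    #include <time.h>
    #include <unistd.h>

    #define CHUNK      4096           /* bytes per read */
    #define RATE_LIMIT (32 * 1024)    /* bytes per second */

    ssize_t trickle_copy(int src, int dst)
    {
        char buf[CHUNK];
        ssize_t n, total = 0;
        /* How long each chunk should take at the target rate: 125 ms. */
        struct timespec gap = {
            .tv_sec = 0,
            .tv_nsec = (long)(1e9 * CHUNK / RATE_LIMIT),
        };

        while ((n = read(src, buf, sizeof(buf))) > 0) {
            if (write(dst, buf, n) != n)
                return -1;
            total += n;
            nanosleep(&gap, NULL);    /* spread the transfer out over time */
        }
        return n < 0 ? -1 : total;
    }

Requirement 2 is the genuinely hard part; no amount of rate limiting helps if the image itself is broken.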

Bwahahah

Posted Oct 31, 2011 15:50 UTC (Mon) by aginnes (subscriber, #81011) [Link]

Some set-top boxes have an architecture where they keep two copies of the system software so they can revert to the previous version if it all goes wrong.

It's a lot easier for the set-top box software to be tested before it is downloaded, plus they own the broadcast bandwidth, so they can use as much as they want. The cable (or satellite, or IPTV) company is operating in a highly constrained environment, they control what devices are connected to their network, they know exactly what hardware is out there. So they can test the software on every different type of hardware that they have deployed before it goes out. This may take them a couple of months. Of course they have a big incentive to make sure their updates don't break the box as every call to their customer support call centre costs money, and if they brick the box, a truck roll costs an arm and a leg!

So updating set-top boxes in a closed environment is a significantly different (and easier) problem to a general "update Linux on any embedded device".

Bwahahah

Posted Oct 31, 2011 16:21 UTC (Mon) by martinfick (subscriber, #4455) [Link]

They also do not have any real user data on them, which as jcm points out is usually the real problem with updates.

Of course they do!

Posted Oct 31, 2011 20:15 UTC (Mon) by khim (subscriber, #9252) [Link]

Even older boxes had customization capabilities (for example you had the ability to select a few "favorite" channels). Newer ones often include DVR capabilities and video rental capabilities, so personal information is most definitely there.

What you probably meant is "they contain no unclassified personal data in there"... but is that a good thing to have on your embedded device? We are entering the era of parental computing - and set-top boxes show that people are quite willing to accept it if it means hassle-free gadgets.

Sure...

Posted Oct 31, 2011 16:58 UTC (Mon) by khim (subscriber, #9252) [Link]

So updating set-top boxes in a closed environment is a significantly different (and easier) problem to a general "update Linux on any embedded device".

Sure. But my point still stands:
1. It's possible to safely upgrade software in a device.
2. People tolerate updates just fine as long as they "just work".
3. Reversions, bandwidth requirements and changelogs don't affect acceptance at all.
3a. As you've noted, people rarely care about bandwidth, but they do care about money spent on bandwidth. This is a different - and solvable - problem.

Some set-top boxes have an architecture where they keep two copies of the system software so they can revert to the previous version if it all goes wrong.

Sure, but this is the same approach ChromeOS is using. This is merely a detail of the upgrade mechanism implementation which makes it more robust. I've never seen a set-top box which allowed you to arbitrarily select one of two versions of the software to boot. Sometimes it can be accomplished using some combination of knobs, but it's usually part of a "recovery procedure", not something the end-user is supposed to do.

When (and if) btrfs is mature enough, snapshots will be able to do the same with less overhead.

Bwahahah

Posted Oct 31, 2011 17:43 UTC (Mon) by nix (subscriber, #2304) [Link]

Yeah, that would be enough... if you trusted things of the complexity of modern software upgrades to 'just work'. All too often, they don't. All too often, even BIOS upgrades don't, and BIOSes are not very complex compared to most software.

BIOS is certainly less complex than a set-top box :-)

Posted Oct 31, 2011 20:50 UTC (Mon) by khim (subscriber, #9252) [Link]

Sure, BIOS updates sometimes fail - but that just means they were not designed to be constantly updated. The aforementioned set-top boxes are significantly more complex than BIOSes - yet they are routinely upgraded with no ill effects.

Actually another example fits as well: PS3 upgrades may be annoying because they routinely remove features, but situations where they break something they are not supposed to break are exceedingly rare.

It's not hard or impossible to develop a thing which can be safely auto-updated; it just means that you must severely restrict it. Complexity has nothing to do with the ability to reliably upgrade something. Diversity is the killer.

BIOS is certainly less complex than a set-top box :-)

Posted Nov 1, 2011 14:22 UTC (Tue) by nix (subscriber, #2304) [Link]

True. I suppose the problem with BIOS updates is mostly the diversity of the hardware the BIOS is used with, and the near-impossibility of thoroughly testing it, not with all conditions found during use, but with all conditions found *at boot* or even during the running of an update: hence the relative likelihood of boot-time failures bricking your system after a BIOS upgrade.

I do wonder, though, how much of the set-top boxes' firmware is actually upgraded by an update. If the components necessary to boot and do an update don't change, then that might reduce the likelihood of failure -- though the fact that we rarely see them fail in any way suggests that this is not the problem.

So... continuous upgrading works for stuff (even complex stuff) that runs in simple or consistent environments or that is not itself necessary to run the upgrade process. That's probably enough for embedded stuff, since their environments are normally nailed down by the vendor and all variations known. Just look out for embedded systems that run as part of larger systems, and whose upgrading is controlled by the upstream vendor: those may not have been tested in the environment they're being upgraded in. (This, too, is probably rare: the integrator would probably want to control the upgrade stream in such a case, since it's them on the line if things go wrong.)

Bwahahah

Posted Oct 31, 2011 17:47 UTC (Mon) by raven667 (subscriber, #5198) [Link]

You've just described how the bandwidth requirements are managed and these devices certainly do allow updates to be reverted, so I'm not sure you made the point you wished to make. Furthermore, cable boxes are often owned by the cable company, not the subscriber, and they certainly do test and read changelogs before updating their own hardware. Are you suggesting that end-user hardware should only be rented and not owned?

Who owns the system, who is responsible for doing the maintenance? That's the person who wants access to manage revisions and read changelogs, although they may certainly decide to just apply changes as they become available. Having a robust mechanism makes that easier.

Perhaps your providers are different from mine...

Posted Oct 31, 2011 21:11 UTC (Mon) by khim (subscriber, #9252) [Link]

These devices certainly do allow updates to be reverted.

Technically - yes, they can. But then the next picture you'll see on the TV when you connect them to the network is "please wait while the system update is installed". IOW: the revert capability is only there to assist in the upgrade process; it's not designed to be used by the end-user (except when directed by tech support to "properly" upgrade the device).

Furthermore, cable boxes are often owned by the cable company, not the subscriber, and they certainly do test and read changelogs before updating their own hardware.

It does not make any difference in all cases I've encountered: yes, you can buy the box - but this only affects your monthly payments; if you buy it you can do whatever you want with it till you connect it to the cable network. But if you do want to connect it to the cable network - you automatically agree to install the network's firmware.

Who owns the system, who is responsible for doing the maintenance?

These are two different questions - and they naturally have different answers. That's true for plumbing, electric wiring, cars and so on... so why should it be the same for computers and gadgets?

That's the person who wants access to manage revisions and read changelogs, although they may certainly decide to just apply changes as they become available. Having a robust mechanism makes that easier.

Right, but how to help the guy who actually does the maintenance is a separate issue from how to push updates to the end-user. Even there changelogs are not all that helpful: someone who's responsible for pushing updates to the end-user does not need to know what changes are there, s/he only needs to know what can go wrong and how to fix the problems.

Bwahahah

Posted Nov 1, 2011 1:49 UTC (Tue) by foom (subscriber, #14868) [Link]

Sorry, but your example doesn't work for me. The number of people I know who have had a software update to their leased DVR set-top-box erase all their stored shows is quite high.

Even though that's the case, people still don't tend to complain...very much...because it's just recorded TV programs, after all, not something *important*.

The embedded long-term support initiative

Posted Oct 30, 2011 16:59 UTC (Sun) by andreasb (subscriber, #80258) [Link]

BIOS flashing and recovery isn't as bad as it was 10 years ago. Just as an example, the Asus P5Q board I have here goes into an automatic recovery mode when it finds a bad BIOS checksum where it will attempt to find a BIOS image on CD, floppy or USB stick and flash it. Plus it has a menu driven BIOS flashing program in the main BIOS for regular upgrades.

Thankfully no comparison to the olden times when you had to boot DOS from floppy to do an upgrade and had to replace the flash chip (or at least reprogram it outside the system) when it got corrupted…

The embedded long-term support initiative

Posted Oct 30, 2011 21:12 UTC (Sun) by nix (subscriber, #2304) [Link]

Unfortunately, those olden times are still here for pretty much everyone but Asus (who remain excellent). e.g. the Tyan mobo in my server has a DOS flashing program and can only boot from CD, hard drive, or floppy. The machine doesn't have a floppy drive, so in order to flash the BIOS I have to find a way to get DOS onto a bootable CD, then find a way to interact with a headless machine in order to run the flashing program and tell it to get going... and if it goes wrong there is of course no reversion mechanism at all.

But why would you want an error recovery mechanism on great big servers as good as the one on J Random Nobody's desktop? :(

The embedded long-term support initiative

Posted Oct 30, 2011 17:38 UTC (Sun) by raven667 (subscriber, #5198) [Link]

At least for servers I generally try to track the Dell firmware. They don't seem to update unless there is a real reason, so it's not so much breakage due to churn as it is avoiding already-fixed problems. There's not much worse than encountering a hardware crash bug that might have been fixed in a newer firmware revision.

The embedded long-term support initiative

Posted Oct 30, 2011 19:48 UTC (Sun) by epa (subscriber, #39769) [Link]

Surely the answer is to stop relying on updates and patching to fix security holes after the fact. The software needs to be designed so that you know with reasonable certainty that it is secure as shipped, and further measures need to be in place to make sure that even if there is a vulnerability in one part of the system, it doesn't matter much. We have grown inured to releasing software with serious flaws* and patching it later. This would not be acceptable in any other industry. Yes, recalls and field modifications do happen, but they are the exception and considered an embarrassment for the company that shipped a faulty product.

* For the sake of argument, anything that lets an attacker get your credit card number without lots of social engineering may be considered a serious flaw.

The embedded long-term support initiative

Posted Oct 30, 2011 22:24 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

I struggle to imagine how any other industries could begin to compare. We're not talking about something like the lever or wheel. Programmable computers escape our ability to reliably understand what they're doing long before the scale where they're of any use to us, as Radó noticed.

You can try Formal Methods. If you're building a bomb, or maybe the core control systems for a nuclear reactor, you might persuade someone to pay for that. For their extra money, they will get something far harder to use and far less capable than everything else in the world, but very slightly more reliable.

The embedded long-term support initiative

Posted Oct 31, 2011 16:11 UTC (Mon) by zlynx (subscriber, #2285) [Link]

Formal Methods.

Hahah.

With those, all you have done is written the program twice. Once in the language and once in the formal verification.

(Almost) all of the mistakes it is possible to make while programming are possible to duplicate in the formal methods.

So if you have a security flaw where you have forgotten to specify that Method A must not be called before passing Security Check A, or if the specification does not spell out that calling Method B invalidates the current security state, that flaw will exist no matter how thorough the formal methods are.

The embedded long-term support initiative

Posted Nov 1, 2011 11:57 UTC (Tue) by dps (subscriber, #5725) [Link]

Formal methods are much more reliable than most programming languages. Just writing an unambiguous specification helps. Formal methods also allow non-negotiable proof that a proposed solution meets the specification.

You can also prove things like *any* network of the stated components with an infinitely readable external input is deadlock free. I am not aware of any programming language that can do that.

In general the experience is that using formal methods makes the initial phases longer and the final phases shorter, partly because a lot of problems are discovered sooner. Moves like only proving critical safety properties or analysing a high-level design are popular.

As with any tool, formal methods can be misused. I not only know formal methods but believe I have the taste to know when *not* to apply them.

Secure Software and Formal Methods

Posted Nov 2, 2011 11:18 UTC (Wed) by abacus (guest, #49001) [Link]

As far as I know formal methods are fine for functional specifications. I'm still waiting for a formal specification of a secure operating system though. See e.g. Gerwin Klein et al., seL4: formal verification of an operating-system kernel, Communications of the ACM, Volume 53 Issue 6, June 2010.

Secure Software and Formal Methods

Posted Nov 2, 2011 17:33 UTC (Wed) by zlynx (subscriber, #2285) [Link]

My complaint about formal methods is that they aren't any kind of cure-all. They rely on the specifications being complete and that the specifications are correctly translated into the formal verification method.

In almost all programming, the specifications are incomplete or incorrect. Programmers, even without realizing what they are doing, fill in the gaps in incomplete specifications and may not even realize the specification was missing information.

Many bugs are caused by programmers on each side of an interface assuming different things about the specification. An example of this is the POSIX select() timeout parameter.
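
The select() case is a good concrete illustration: POSIX permits an implementation either to leave the timeout argument untouched or to update it with the time remaining, and Linux does the latter while many other systems do the former. Code written against one reading of the specification misbehaves under the other, as in the following small example (not taken from any particular program):

    /* On Linux, select() decrements 'tv' to the time left; many other
     * systems leave 'tv' alone, and POSIX allows both.  Code that reuses
     * the struct without resetting it therefore differs across platforms:
     * here the intended 5-second wait degenerates into a busy poll on
     * Linux once the first timeout has driven 'tv' down to zero.
     */
    #include <stdio.h>
    #include <sys/select.h>

    int main(void)
    {
        struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
        fd_set rfds;

        for (;;) {
            FD_ZERO(&rfds);
            FD_SET(0, &rfds);              /* watch stdin */

            /* Portable code must reinitialize tv on every iteration. */
            int ready = select(1, &rfds, NULL, NULL, &tv);
            if (ready < 0)
                return 1;
            if (ready == 0)
                printf("timeout, tv is now %ld.%06ld\n",
                       (long)tv.tv_sec, (long)tv.tv_usec);
            else
                break;                     /* input is available */
        }
        return 0;
    }

The fix is trivial (reset tv each time around the loop), but nothing in the type of the interface forces the two sides to agree on which behavior to expect.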

To fully specify the behavior of a system under all conditions is, in my opinion, as prone to oversights and misunderstanding (aka bugs) as writing the program itself in assembly or C. (Well, you can make a lot more typo-type mistakes writing in C, but I think it holds for logic errors.)

Reading through some of that seL4 paper you linked, I see they had to make about 300 changes in the spec during the process. In other words, they had to debug it. I wonder how they know if they found all the problems in their spec?

Who will pay for it?

Posted Oct 31, 2011 7:59 UTC (Mon) by khim (subscriber, #9252) [Link]

Surely the answer is to stop relying on updates and patching to fix security holes after the fact.

I doubt it.

The software needs to be designed so that you know with reasonable certainty that it is secure as shipped, and further measures need to be in place to make sure that even if there is a vulnerability in one part of the system, it doesn't matter much.

Do you propose a new law? What measures do you propose to make sure software development will not move to other, more liberal, countries?

Because without a government mandate any company that tries this approach will just go bankrupt (by the time it releases anything the market will have moved to the "next big thing"), so you'll need something big to make it happen. Do you really believe bureaucracy struggles over such mandates (this is what will actually happen; I doubt the quality of the software itself will grow all that much) will actually make your life easier or better?

This would not be acceptable in any other industry.

Are you sure? From what I'm seeing other industries view this as a problem, not as an achievement, and move to the "built-in computer with updates" model where they can. Sure, centuries-old industries are too conservative to employ such ideas, but other, newer, industries... mobile phones, TV sets, Blu-ray players and so on: they all switched from the "what you buy is what you'll use till the device dies" to the "we'll fix any bugs later" model - and I'm sure other industries will follow.

Yes, recalls and field modifications do happen, but they are the exception and considered an embarrassment for the company that shipped a faulty product.

They were exceptions and were considered an embarrassment because they severely affected the bottom line. Now, when they are cheap... the situation is changing. Sure, we'll not see upgradeable car computers any time soon (because cars have a lot of legal requirements around them), but "entertainment centers" in cars will surely follow the same model soon. Actually they were replaceable for a very long time so you can view them as precursors of today's "ship then fix" model...

Who will pay for it?

Posted Oct 31, 2011 15:19 UTC (Mon) by epa (subscriber, #39769) [Link]

Relying on updates to fix security holes does not work! It hasn't worked to keep us secure over the past twenty years; what reason is there to suppose it will work for the next twenty?

You mention mobile phones, Blu-ray players and so on. Those are all parts of the software industry. When I said 'this would not be acceptable in any other industry' I was referring to industries other than software. It is not good enough to put up an unsound building and return to fix it later when flaws are discovered. (It does happen, but is rare and shaming for the architects and builders involved.) You can't sell a desk lamp with unsafe wiring and rely on asking people to take it back to the store later. Only in the software industry do we try to get away with such practices. And as you say, the market usually tolerates it.

Who will pay for it?

Posted Oct 31, 2011 16:38 UTC (Mon) by mpr22 (subscriber, #60784) [Link]

In the building case: Correctness is feasible to achieve and validate, and the cost of remedying errors is relatively large.

In the desk lamp case: Correctness is not merely feasible but trivial to achieve and validate.

Software that does what there is currently perceived to be a demand for often lies somewhere between insanely hard and mathematically impossible to achieve and validate correctness of. The cost of remedying an error, on the other hand, is not strongly related to the severity of its consequences. Many disastrous software errors turn out to have trivial fixes.

(And, of course, even if your software is provable, all you can prove is that it conforms to the provided specification. Proving that the specification is a correct statement of the requirements, or that the requirements were well-formed in the first place, is a separate problem.)

What are you talking about?

Posted Oct 31, 2011 16:39 UTC (Mon) by khim (subscriber, #9252) [Link]

Relying on updates to fix security holes does not work!

On the contrary: it works very well indeed. The companies who use this approach survive and thrive. The companies who lost the wind and tried to fix all the bugs before shipment are history.

You mention mobile phones, Blu-ray players and so on.

Yup.

Those are all parts of the software industry.

Not even close. The first mobile network started operating back in 1979; it was analogue and had nothing to do with software. The first LD player was on sale a year before that - and Blu-ray is its direct descendant (from the end-user POV). And the first TVs were created even earlier: television was introduced back in 1928 and most definitely had nothing to do with software.

Only in the software industry do we try to get away with such practices.

Again: not true at all. Lots of industries use this approach too: mobile phones, credit cards, etc. Initially they had pathetic security but since they were convenient they were used anyway. Later, when fraud became a problem, additional layers of security were added. The same happened with printed banknotes a few centuries before. You can go back a few thousand years (when the first stamps and other similar tools were invented) - and see the very same picture. Again: special inks, papers, procedures and so on followed, not preceded.

In fact where information is exchanged, "rely on updates to fix security holes" is the typical approach, not an exception. The only thing software introduced is "fast" updates. When you introduce a new, more protected banknote you must do this in a slow, very spread-out manner. But when new software is created to patch a vulnerability... you can push it in a hurry.

So no, I don't believe in the "fix all the bugs before shipment" approach. It has failed us for thousands of years - why do you think it can be fixed now?

Who will pay for it?

Posted Oct 31, 2011 16:58 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

But we aren't talking about unsafe wiring in a desk lamp. Even if we were I'd argue that spending two hours travelling on the bus with a microwave (it caught fire spontaneously after ~8 months use) to get it replaced was a lot more hassle than any software update I've ever undertaken.

Your building example was better. But in fact it's completely routine to "snag" large office or industrial buildings.

When I helped take possession of a four story building in 1998 we were all issued with a sheet of orange stickers labelled "snag". Each problem we discovered was to be marked with a label, and added to a list maintained by the liaison to the building contractor. Some problems were corrected in a few days to our satisfaction, as well as if they'd been corrected during construction. Many were "bodged" so that they met the strict letter of the requirements, but were not really adequate (e.g. it's much cheaper to fiddle with the hinges on a door to make it close, for a few days until it settles again, than to buy and install a new door which fits properly). A few simply couldn't be fixed at all, no way around it without tearing down the building and beginning anew. Some orange stickers for those last problems remained as visible sores on the pristine new building until their adhesive failed years later.

If we added to the "snags" every conceivable way a resourceful and determined attacker could get into the building, we'd never have finished. What if they just drive a truck through the large glass frontage? What if they tailgate a legitimate employee? What if they pay someone to pull the fire alarm, dress as firemen, and just walk in?

Who will pay for it?

Posted Nov 1, 2011 2:02 UTC (Tue) by foom (subscriber, #14868) [Link]

> It is not good enough to put up an unsound building and return to fix it later when flaws are discovered. (It does happen, but is rare and shaming for the architects and builders involved.)

On the contrary, putting up a building with serious "bugs" is exceedingly common. Where it matters most (building will fall down), extra care is taken to make sure it won't fail in that way, but, for all the other ways a building can be broken, it's quite common for a new building to in fact *be* broken in all of those ways.

E.g. leaking excessive amounts of water, light switches are in stupid places that don't make any sense, HVAC doesn't work right (hot/cold areas), architect decided not to put stairwells in elevator lobby (but rather halfway across the building in a supposed-to-be-locked area), so it turns out it's illegal to actually lock the entrance to the floor from the (public) elevator lobby, etc...

Who will pay for it?

Posted Nov 2, 2011 8:15 UTC (Wed) by ekj (subscriber, #1524) [Link]

Writing bug-free software does not work. As in, literally nobody has figured out how to do it, not even with near-infinite budgets for near-trivial computations.

Even in those situations where we use the very strictest of quality controls, and as a result end up paying orders of magnitude more than we would for "normal" software, we still get banal, -stupid- bugs like the Mars Climate Orbiter doing lithobraking due to one software module using imperial units rather than metric like the rest of the software. (i.e. bugs not unlike those typical of normal off-the-shelf software)
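
That class of bug is easy to reproduce in miniature: both modules are individually "correct", and the mismatch lives only in an unwritten unit convention at an interface whose type is just a bare double. A contrived C illustration (the function names and numbers are invented; this is not the actual flight software):

    /* Both sides compile cleanly; the bug is an unstated unit convention.
     * All names and values are invented for illustration.
     */
    #include <stdio.h>

    /* Navigation module: expects the impulse in newton-seconds (metric). */
    static double predict_delta_v(double impulse_newton_seconds, double mass_kg)
    {
        return impulse_newton_seconds / mass_kg;   /* m/s */
    }

    /* Thruster module: reports the impulse in pound-force seconds. */
    static double measured_impulse_lbf_s(void)
    {
        return 100.0;   /* 100 lbf*s is roughly 444.8 N*s */
    }

    int main(void)
    {
        double mass_kg = 338.0;

        /* The interface is just 'double', so nothing complains... */
        double wrong = predict_delta_v(measured_impulse_lbf_s(), mass_kg);

        /* ...but the result is off by the lbf-to-N factor of ~4.448. */
        double right = predict_delta_v(measured_impulse_lbf_s() * 4.448222,
                                       mass_kg);

        printf("delta-v: wrong %.3f m/s, right %.3f m/s\n", wrong, right);
        return 0;
    }

A formal specification catches this only if someone thought to write the unit into it in the first place.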

In most lines of business, simply *trying* to do software like that would guarantee bankruptcy. There's a reason things are done the way things are done.

Who will pay for it?

Posted Nov 3, 2011 9:53 UTC (Thu) by jschrod (subscriber, #1646) [Link]

> It is not good enough to put up an unsound building and return to fix it
> later when flaws are discovered. (It does happen, but is rare and shaming
> for the architects and builders involved.)

I wish you lots of luck if/when you build your first house; that fantasy will be rather rudely destroyed.

Who will pay for it?

Posted Oct 31, 2011 15:22 UTC (Mon) by epa (subscriber, #39769) [Link]

...and obviously, an entertainment centre that has bugs is fine, because it is not safety critical and cannot be used to steal money from you. My point is that things which are important must be held to a higher standard than that used for video games.

Who will pay for it?

Posted Oct 31, 2011 15:35 UTC (Mon) by KSteffensen (subscriber, #68295) [Link]

An entertainment center used to show movies from some Netflix-like service could conceivably hold credit card information, so even things that don't immediately seem critical might be.

And some of us certainly wouldn't mind video games being held to a higher standard than what is currently the case.

Actually it CAN steal money from you - and that's the point

Posted Oct 31, 2011 16:48 UTC (Mon) by khim (subscriber, #9252) [Link]

...and obviously, an entertainment centre that has bugs is fine, because it is not safety critical and cannot be used to steal money from you....

It can be used for that - and that's the point.

Money itself (coins and banknotes) was one of the first objects to employ "security updates" in its design.

"Security updates" are not something invented with software - on the contrary, they have a centuries-long history! And yes, they worked. Fraud happened all the time, but as long as it was rare enough the "security patches" worked. They worked for so long - why do you think tomorrow everything will suddenly collapse? What exactly has changed today?

The embedded long-term support initiative

Posted Oct 29, 2011 14:12 UTC (Sat) by xxiao (subscriber, #9631) [Link]

3.0 is a good pick, not only because of Android 4.0; it also has the newest RT kernel patches (some are out of tree).

I hope this effort can sync up with the RT developers in the future.

Peer review ?

Posted Oct 29, 2011 14:53 UTC (Sat) by xav (guest, #18536) [Link]

It looks to me like the intended process is to collect patches without the kind of review one can witness on lkml. Won't that be a way to fast-track dubious changes into the main tree?

Peer review ?

Posted Oct 31, 2011 12:52 UTC (Mon) by armijn (subscriber, #3653) [Link]

The fast track will only be used in some "unusual and urgent" cases (as the article says; I don't know what the unusual and urgent cases will be, but I am assuming and hoping that the maintainers will be very conservative) and the default will be "upstream first". Before things end up in the mainline kernel.org tree they will be peer reviewed anyway.

Peer review ?

Posted Nov 3, 2011 5:55 UTC (Thu) by josh (subscriber, #17465) [Link]

Fast-tracking changes into this tree will not get them into the kernel, and will almost certainly get them more scrutiny and skepticism, not less. This sounds more like a variant of staging.

The embedded long-term support initiative

Posted Oct 29, 2011 16:21 UTC (Sat) by fuhchee (subscriber, #40059) [Link]

"only bug fixes and security updates will be considered"

How do they intend to divine which mainline patches are security-sensitive?

The embedded long-term support initiative

Posted Oct 30, 2011 7:23 UTC (Sun) by raven667 (subscriber, #5198) [Link]

They will make a guess, which will necessarily be incomplete, such that their secure kernel is still vulnerable to any number of issues that haven't been publicly discovered yet. Vulnerabilities are like that, they exist for years before anyone notices them.

The embedded long-term support initiative

Posted Oct 30, 2011 8:44 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

Who cares? Kernel is insanely vulnerable anyway. The only sensible way is to minimize the attack surface as much as possible.

The embedded long-term support initiative

Posted Oct 30, 2011 14:56 UTC (Sun) by vonbrand (subscriber, #4458) [Link]

Kernel is insanely vulnerable anyway.

Care to quantify and prove this?

The embedded long-term support initiative

Posted Oct 30, 2011 17:43 UTC (Sun) by raven667 (subscriber, #5198) [Link]

I think we are past that now; the lead kernel developers don't believe or claim that one can successfully run malicious local processes without the possibility of local root compromise. Look at kernel.org, where they are no longer giving away shell access just for fun, to reduce the attack surface area. No one is making the claim that the Linux kernel is sufficiently free of vulnerabilities that one could host multiple malicious users safely on the same system. The talk now is about virtualization for security isolation; what is being done on a single system image isn't good enough.

The embedded long-term support initiative

Posted Oct 31, 2011 10:43 UTC (Mon) by paulj (subscriber, #341) [Link]

Except the virtualisation systems, common ones at least like Xen and Qemu/KVM, don't seem to take any different approach to secure programming than the kernel does. They offer no more assurance of security than the kernel. While they might have fewer interfaces to their host than the kernel does to regular users, those interfaces can be very very complex (because performance is so important) and even arcane (e.g. for compatibility with x86 distributions - applies even with Xen sometimes). Xen and KVM regularly have issues that compromise host security.

Virtualisation does not seem a solution to me. Any systematic solution to security of hypervisors seems like it'd apply equally well to traditional kernels, surely?

The embedded long-term support initiative

Posted Oct 31, 2011 18:04 UTC (Mon) by raven667 (subscriber, #5198) [Link]

Sure, what you say is true but the important point is that the interface between the OS kernel and the Hypervisor is much smaller and more rigidly defined than the interface between a user process and an OS kernel. The Hypervisor has orders of magnitude fewer features and attack surface area and is therefore more practical to usefully validate.

The embedded long-term support initiative

Posted Oct 30, 2011 21:37 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

raven667 said pretty much what I think.

Linux may be somewhat secure if one limits it to simple routing and firewall-related tasks. It's certainly not secure if one decides to use it, for example, to host world-accessible NFS shares or try to contain malicious local users.

And by this point in time, it can't really be fixed short of rewriting it in a safe language.

The embedded long-term support initiative

Posted Oct 31, 2011 7:37 UTC (Mon) by Lionel_Debroux (subscriber, #30014) [Link]

Perhaps the number of vulnerabilities (multiple forms of DoS, information leaks, etc.), of various severity, which affect the kernel (as a special case of a huge piece of complex software), and are introduced and fixed by the dozen with every major kernel release?
Counting CVEs is a weak measurement of the number of vulnerabilities, since only a small subset of vulnerabilities gets a CVE number.

A way to get a more secure Linux kernel is to use the large PaX/grsecurity patch, which prevents a number of classes of vulnerabilities from being successfully exploited.

The embedded long-term support initiative

Posted Oct 30, 2011 10:00 UTC (Sun) by dirklwn (subscriber, #80581) [Link]

Looking from an ARM point of view, in a small private discussion after Tsugikazu Shibata's talk, some questions came up:

What's the difference between this LTSI and what Linaro is already doing? Will LTSI have the man power to do what they plan? Where should this man power come from?

From an ARM point of view, it sounds like there is some overlap between what Linaro is doing and what LTSI plans to do. A lot of ARM developers are already working for Linaro, doing parts of what LTSI plans to do.

Wouldn't it make sense to align LTSI and the Linaro work somehow? Again, at least for the ARM world?

The embedded long-term support initiative

Posted Oct 30, 2011 15:36 UTC (Sun) by tbird20d (subscriber, #1901) [Link]

We (CEWG) are in talks with Linaro to see how we can collaborate on this. Linaro is doing some great work, which we would of course like to leverage. Hopefully, we'll be able to pick up some things from CE vendors that they wouldn't otherwise get, and get those into their tree. I'm hopeful there will be lots of code exchange between the two projects. But we'll have to see how quickly they move on to another kernel version.

Linaro has a pretty aggressive upstream-first policy (which is good, but it keeps them constantly moving forward). The version gap between what companies are actively using in their products and mainline is precisely what LTSI is targeted at bridging. I can imagine LTSI pulling features that Linaro has pushed to mainline.

The embedded long-term support initiative

Posted Oct 30, 2011 23:16 UTC (Sun) by dsimic (subscriber, #72007) [Link]

Here's how I see that -- the embedded industry (and business in general) needs something they can rely on. They don't care if it's called Linux, Android, Red Potato or whatever. They just want something for free that they at least *think* somebody is standing behind.

If mainline Linux can't provide that, somebody else will (hm, Android?), taking the big piece of cake mainline Linux has at the moment. Indeed, Android may actually be no better than the mainline for someone's purpose of building a networked TV set, but Android is backed by Google and gives the industry what it needs -- the feeling of someone taking care of it. And Google has the $$$ needed for hiring good people and making Android do what businesses need.

I'd say -- mainline, hurry up and don't shoot your own foot. ;)

See what Nokia did? They had a golden egg called Maemo (or MeeGo or whatever the name), which could've kept them as the leader of that industry area, having everything under one roof -- own hardware and own software. One can say that Maemo is a full-blown Linux with everything opened up and no Java VMs in between keeping things secure... I'll just ask if Windows Mobile is any better from that point of view?

Do you *really* think WinMobile is more secure than Maemo? And Maemo / MeeGo is now killed like a two-day chicken and left to rot. And what a great product that was. Now I have to throw my N900 in the garbage -- because the industry doesn't care about technically correct things, or things that require too much effort.

Do you really think that Nokia's CEO even knows the difference between Win Mobile and something else? ;) It's all the same touchscreen pictures and revenue charts for him. ;) It's the same with the embedded world -- they'll keep trying and go somewhere else when they reach the point of "too much effort required".

The embedded long-term support initiative

Posted Oct 31, 2011 21:49 UTC (Mon) by cmccabe (guest, #60281) [Link]

> Do you really think that Nokia's CEO even knows the difference between
> Win Mobile and something else? ;) It's all the same touchscreen pictures
> and revenue charts for him. ;) It's the same with the embedded world --
> they'll keep trying and go somewhere else when they reach the point of
> "too much effort required".

Nokia's current CEO, Stephen Elop, worked for Microsoft as the head of the Office business division before joining Nokia. Whether or not you believe his business decisions were in Nokia's best interests, he definitely knows the difference between Windows and something else.

The embedded long-term support initiative

Posted Oct 31, 2011 22:13 UTC (Mon) by dsimic (subscriber, #72007) [Link]

> Nokia's current CEO, Stephen Elop, worked for Microsoft as the head of the
> Office business division before joining Nokia. Whether or not you believe
> his business decisions were in Nokia's best interests, he definitely knows
> the difference between Windows and something else.

Well, I can believe that, and I already knew he previously worked for Microsoft.

Taking everything else aside, I really can't believe how such a great CEO could've
been asleep for years, without realizing that Symbian -- Nokia's flagship and business
core -- had already been dead in the water for years.

So they've been floating already dead on their previous glory and reputation, and
suddenly the big CEO realized that leads nowhere, so he boldly killed everything
they've already developed in-house and went to his old home -- Microsoft.

Well, however -- I'll never again buy a Nokia, and many people I know will
do the same. When compared to Android, Windows Mobile looks like a joke.

The embedded long-term support initiative

Posted Oct 31, 2011 23:25 UTC (Mon) by cmccabe (guest, #60281) [Link]

Elop became CEO in late 2010. He wasn't around during the years when Nokia was trying to pretend that Symbian was still viable.

Back then, Symbian had more than half of the marketplace, and analysts were still producing graphs showing Android and iOS slowly gaining market share over a period of years. What the analysts didn't understand is that Symbian was a horrible platform to program for and to extend, very little loved by anyone other than Nokia itself. Nokia is still a hardware company at heart and that is what they ought to have focused on. However, it would have taken a brave CEO to do that back when all the MBAs were gushing about the cloud and the oh-so-urgent need to be "more than a hardware company."

Now Nokia's platform ambitions have come to nothing, and it looks like their hardware business is circling the drain too. The best case scenario is that they become another generic OEM for Microsoft. The worst case scenario is that they go back to making rubber boots.

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds