LWN.net Weekly Edition for September 14, 2017
Welcome to the LWN.net Weekly Edition for September 14, 2017
This edition contains the following feature content, much of which comes from the Open Source Summit North America:
- Antipatterns in IoT security: a number of mistakes to avoid when designing devices for the Internet of things.
- Signing programs for Linux: improving system security with signed executables.
- The first half of the 4.14 merge window: 8,000 patches worth of changes for the 4.14 kernel.
- Running Android on a mainline graphics stack: progress toward the ability to run an Android device using mainline kernel graphics and why it matters.
- A different approach to kernel configuration: the kernel's configuration system is awful; here is a tool that might make configuring a kernel easier.
- Mongoose OS for IoT prototyping: an operating system for small devices.
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
Antipatterns in IoT security
Security for Internet of Things (IoT) devices has been something of a hot topic over the last year or more. Marti Bolivar presented an overview of some of the antipatterns that are leading to the lack of security for these devices at a session at the 2017 Open Source Summit North America in Los Angeles. He also had some specific recommendations for IoT developers on how to think about these problems and where to turn for help in making security a part of the normal development process.
A big portion of the talk was about antipatterns that he has seen—and even fallen prey to—in security engineering, he said. It was intended to help engineers develop more secure products on a schedule. It was not meant to be a detailed look at security technologies like cryptography, nor even a guide to what technical solutions to use. Instead, it targeted how to think about security with regard to developing IoT products.
Background
There are some buzzwords used in the talk that he wanted to define. An "IoT product" is a mass-produced consumer device that communicates over the network and whose primary purpose is not computation. Most of these devices will not use Linux as they will run on microcontrollers that are too small for Linux. Based on that definition, WiFi thermostats, networked AC plugs, and heart-rate monitors with radios all fit. Systems like laptops and smartphones do not (they are for computation), nor do ATMs, network voting machines, or nuclear command and control centers (not consumer devices—he hopes the latter aren't for sale at all).
He borrowed his definition of "securing" from Ross Anderson in his book Security Engineering. It means building systems that "remain dependable in the face of malice, error, or mischance". Security is not a binary yes/no state of a system, it is an ongoing process throughout the lifecycle of the system.
Bolivar is an embedded software engineer who works on IoT infrastructure and reference implementations for Linaro; security is a big piece of that work. Before that, he founded and ran an embedded systems company that created open-source hardware, firmware, and software, as well as doing some consulting on the side. He noted that over time more and more of his projects included networking, but that "projects with a solid security story" did not follow that trend closely, which leaves a big gap.
There is some good news, though: security engineering is a robust field with many experts who have established techniques that can be used to help secure IoT devices. The bad news is that the IoT industry is "doing it wrong"; there aren't enough of these experts to go around. The way to win is for those who are not security experts to learn and apply their ways. Those things will need to be incorporated into the product development workflow, but that can be done slowly and iteratively, he said.
There are plenty of reasons that companies should care about IoT security, but an oft-heard argument is that these problems are not something the company has to deal with. The costs associated with securing these devices are an externality in economic terms (e.g. pollution, since the cost of it is not borne by the creator). So there is resistance to spending money on security engineering at times.
But the problem is real. Major distributed denial of service attacks have stemmed from insecure IoT devices, medical devices have potentially fatal flaws, and new major IoT infrastructure (e.g. Tizen) often has many zero-day vulnerabilities to exploit. Public concern about these problems will likely result in less willingness to buy IoT products, he said. Bruce Schneier has famously called for government regulation and legislators in various places worldwide are listening. Bolivar said there is reason to believe that the externality argument does not hold much water and will hold less over time.
There are some economic concerns that might also lead companies to fund security; preventing device "cloning" is one, but a better security story can also be a product differentiator. In enterprise and business-to-business (B2B) contexts, support contracts might play a role; products that are also used internally (or might be) would also help provide an incentive to secure them. Some may simply feel that securing these devices is the "right thing to do". There may be other reasons as well, of course; whatever the reason, he would like to see companies start securing their devices—hopefully with open-source software.
Antipatterns
The most basic security antipattern is to "do nothing". That means accepting any and all risk, though. Another is to "do it yourself"; that leads to thinking the system is secure because of custom elements, such as non-peer-reviewed cryptography algorithms or implementations and security through obscurity. "Hand-rolled" security systems have not fared well over the years—developers have learned that implementing stream ciphers, for example, should not be tackled in-house. But there is still a fair amount of security by obscurity, such as "super unguessable URLs". If a product becomes successful, which is what you want, the unguessable will become all-too-guessable.
"Simon says security" is the antipattern that determines the system is secure because someone important says that it is. That can stem from vague requirements documents with sweeping security claims. It can lead to security theater (e.g. no lighters on airplanes). It tends to happen when people are panicking; they want security but aren't sure how to make it happen. But that kind of "security" does not meet Anderson's definition since it is not specifically focused any particular threat.
The next up was "just add crypto"—the system is secure because it uses cryptography. The corollary seems to be that the system is even more secure because it uses even more crypto in a cascading list of more and more acronyms (SSL, TLS, DTLS, AES, ...). Bolivar is (perhaps obviously) not saying that crypto is bad, just that it is "not magic". It is easy to misuse, implementations have bugs, key management is tricky, and so on. Adding crypto is not the end of the line for securing a device.
If the system is secure because it uses so many different security technologies, it has perhaps fallen victim to the "security grab bag" antipattern. He noted a real-life remote administration system that a friend had to use: it used a VPN, then an HTML5 remote desktop server to get a remote desktop on a system, from which SSH was used to actually log into the system of interest. In some ways, the grab bag is similar to "just add crypto". It can even work to a certain extent, depending on what's in the grab bag, but it is likely to overprotect some things, while not protecting others. It can often be a waste of resources because it does not focus on the most important threats.
An attempt to "aim for perfection" is another trap. Bolivar likened it to building a bomb-proof door before adding a window lock or trying to stop a determined nation-state level attacker before the basics are handled. This can occur when engineers get carried away in brainstorming sessions or if the people who sign off on security plans ignore "trivial matters" like deadlines and salaries.
Perfect systems never ship, so they are "tautologically secure". Any system that ships has issues, both known and unknown; security is no different. In the IoT world, it is important to remember that these devices are no longer your systems. Customers have physical access, but the problem starts even before then; contractors that build and ship the devices also have that access. Any attempt to reach perfection is likely to be seen as zealotry, which leaves a bad taste in people's mouths and reduces security buy-in.
"Release and forget" is also common. The thinking is that the system is secure, so nothing more needs to be done. Even if something does need to change, though, the build cannot be reproduced: some of the source code has gone missing or vendors won't support newer versions with needed fixes. The support window for the device may not have been specified and there may be no mechanism for people to report vulnerabilities. There may also be no way to update deployed devices at all. This "strategy" is unworkable; it means that vulnerabilities cannot be fixed, it alienates the security community that you would rather have on your side, and it antagonizes customers. But it does cost real money to ensure these things, so there have to be business reasons for a company to care.
The last antipattern he noted is the "kill the messenger" approach; sue anyone who says that the product is not secure. That includes making legal threats over vulnerability reports as well as lobbying for laws to prevent security research. Those efforts may chill research and reporting, but they will cause bad press that can damage your brand. They also antagonize people who can sell the vulnerabilities they find (often anonymous people who are difficult to threaten or sue).
Better patterns
Instead of adopting one or more of the above approaches, there are alternatives. To start with, devices should not connect to the network if they don't need to. Bolivar said that his management is not happy when he says that, but devices that are not connected have a much reduced attack surface. Similarly, don't collect information that is not needed; it is hard for servers or devices to give up information they never possessed to begin with.
Threat modeling is an important part of the process. Iteratively building and using threat models will result in more secure systems. It is also important to keep security planning in the normal workflow of the development process. Security bugs and features should be tracked in the same systems and factored into planning the same way as other bugs and features are. Otherwise, you can reach a point where the schedule only reflects part of the work that needs to be done, which results in long hours and slipped release dates.
The "one slide" he would like everyone to remember from his presentation (slide 31 in his slides [PDF]) reinforces the message on threat models. There are many approaches, but all involve modeling the system in question, deciding what the important problems are, and then mitigating them in priority order. It is best to start small and then iterate, Bolivar said.
For IoT, he recommended the approach laid out in Threat Modeling: Designing for Security by Adam Shostack. The book covers Microsoft's methodology, which is applicable to IoT with some tweaks. The book starts out in a fairly lightweight and easy to understand way; it describes how to improve the process as you go in an evolutionary way. It is opinionated, which he likes; it provides advice rather than just offering a bunch of different options. In addition, the methodology is "battle tested"; Microsoft has gotten much better at security over the years, he said.
There are other books and resources, of course, including the Anderson book he mentioned at the outset. That book is a great read, he said, though not all of it is applicable to IoT. It is, however, over 1000 pages long (even the bibliography is over 100 pages, he thinks) so it is a substantial amount of reading to get started. There are also threat modeling resources from the Open Web Application Security Project (OWASP); those are focused on web applications, as the name would imply, but much of it is also applicable to IoT. The idea of "threat trees" as described by Schneier and others is useful, but somewhat hard to get started with. There are other resources listed in the slides.
As he was wrapping up (and running out of time), Bolivar gave a brief overview of the threat modeling described by Shostack. It is important to model the system with a data-flow diagram; the book focuses on software but, for IoT, it makes sense to look at the hardware as well; the schematics of the device should be consulted. Make concrete choices of what to protect and address those in a breadth-first way. Keep the antipatterns in mind as things to avoid. You will never reach the bottom of the list of protections, but that is expected; avoid "aim for perfection" and test to make sure the product is good enough to ship.
[I would like to thank the Linux Foundation for travel assistance to attend OSS in Los Angeles.]
Signing programs for Linux
At his 2017 Open Source Summit North America talk, Matthew Garrett looked at the state of cryptographic signing and verification of programs for Linux. Allowing policies that would restrict Linux from executing programs that are not signed would provide a measure of security for those systems, but there is work to be done to get there. Garrett started by talking about "binaries", but programs come in other forms (e.g. scripts) so any solution must look beyond simply binary executables.
There are a few different reasons to sign programs. The first is to provide an indication of the provenance of a program; whoever controls the key actually did sign it at some point. So if something is signed by a Debian or Red Hat key, it is strong evidence that it came from those organizations (assuming the keys have been securely handled). A signed program might be given different privileges based on the trust you place in a particular organization, as well.
Signing also provides a form of tamper resistance. It is not possible to modify a program without invalidating the signature. Package signing does not provide this assurance, however. It shows that the package was not tampered with up until it was installed on the system; after that, there is no guarantee that the contents have not changed.
There is another benefit to signing that is related to the ability to know the provenance of the code. If it is determined that a certain key has either been compromised or is signing untrustworthy programs, all trust in that key can be removed from the system. This provides a way to blacklist programs emanating from a malicious (or insecure) organization, he said.
There have been various efforts to add signatures to programs along the way, including a way to integrate signatures in ELF binaries. That particular solution does not handle all of the use cases, though. For one thing, not all programs are ELF binaries; scripts for Python and other languages are semantically equivalent to ELF binaries but are just text files on disk. Scripting languages give access to system calls and other security-sensitive facilities. There are also binaries run on Linux that are not ELF, including Windows binaries that are run under Wine and binaries for other Unix platforms. So ELF signatures are not an approach that solves the problem, Garrett said.
IMA and friends
The kernel's Integrity Measurement Architecture (IMA) is another approach. The initial implementation was "fairly straightforward", he said. It will calculate hash values for files and log them based on a configurable policy. It is not restricted to binaries and simply provides an audit trail of the hashes of files accessed.
The policies are fairly fine-grained, so IMA could hash all files executed by anyone, for example, or all files opened by the root user. If there is a malware outbreak detected on one machine, others can be checked to see if they executed something with the same hash. IMA hooks into filesystem access, so it only does the hashing the first time the file is accessed; if the file is changed, it gets rehashed when it is next accessed. But IMA itself is not signing and there is no enforcement mechanism, he said.
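The measure-and-log behavior described above can be illustrated in user space. What follows is only a sketch under stated assumptions, not how the kernel implements IMA: the `MeasurementLog` class and its methods are hypothetical names, SHA-256 is an arbitrary choice of hash, and a change in modification time or size stands in for the kernel's own change detection.

```python
import hashlib
import os


class MeasurementLog:
    """User-space illustration of hash-on-first-access with rehash on change.

    Hypothetical helper; the kernel hooks file access directly instead.
    """

    def __init__(self):
        self._cache = {}   # path -> (mtime_ns, size, digest)
        self.entries = []  # append-only audit trail of (path, digest)

    def measure(self, path):
        st = os.stat(path)
        key = (st.st_mtime_ns, st.st_size)
        cached = self._cache.get(path)
        if cached is not None and cached[:2] == key:
            return cached[2]  # unchanged since the last measurement
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        self._cache[path] = (key[0], key[1], digest)
        self.entries.append((path, digest))  # log only new measurements
        return digest

    def saw_digest(self, digest):
        """Answer 'did this machine ever access a file with this hash?'"""
        return any(d == digest for _, d in self.entries)
```

A fleet-wide malware check of the kind Garrett mentioned would then amount to calling something like `saw_digest()` against the log of each machine.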
Signing is the process of hashing a file then encrypting the hash using some kind of key (typically the private key of a public/private key pair). In order to verify the signature, the file is hashed and the encrypted hash is decrypted using the public key. If the two hashes match, the signature is verified. IMA is a great starting point for signing programs, he said, but more is needed.
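The hash-then-sign round trip can be shown with a deliberately toy example. This is a sketch only: it uses textbook RSA with no padding and a key built from two small, publicly known primes, so it offers no security whatsoever. The function names are illustrative and not part of IMA or any library.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. Far too small, and far
# too well known, to be secure; it exists only to make the arithmetic visible.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    # Reduce the SHA-256 digest into the range of the toy modulus.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    # "Encrypt" the hash with the private key.
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    # "Decrypt" with the public key and compare against a fresh hash.
    return pow(signature, E, N) == digest_as_int(data)
```

Real deployments would use a vetted cryptographic library with proper padding and key sizes; the point here is only that verification needs the public half of the key and nothing else.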
The IMA appraisal feature adds the ability to store the raw hash or a signature in the security.ima extended attribute (xattr) of a file. Most filesystems have support for xattrs and the security xattr namespace is managed by the kernel, which protects those attributes from unprivileged updates. IMA appraisal allows the creation of policies controlling what happens when hashes or signatures do not match. For example, if there is no signature or the hash in the signature does not match the hash of the file, execution can be blocked.
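For readers who want to see whether a file on their own system carries such an attribute, a minimal sketch follows. It assumes a Linux system where Python exposes `os.getxattr()`; the helper name `read_ima_xattr` is made up for this illustration, and the returned bytes are left uninterpreted since their layout is not covered here.

```python
import os


def read_ima_xattr(path):
    """Return the raw security.ima value for path, or None.

    None covers every failure case: no such attribute, no such file,
    a filesystem without xattr support, or a platform without getxattr.
    """
    getxattr = getattr(os, "getxattr", None)  # only available on Linux
    if getxattr is None:
        return None
    try:
        return getxattr(path, "security.ima")
    except OSError:
        return None
```

On most current distributions this will return `None` for every file, which matches the observation that signatures are not yet being shipped.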
All of that is only useful if you have signatures associated with the files that get installed on your system. Right now, those are not really available as distributions are not shipping signatures for the files in their packages. He has been working with the Debian dpkg maintainer to add metadata to .deb files; that metadata could be used to store signatures.
Build time is an obvious place to add signatures but it doesn't have to be done then; the .deb file could be pulled apart and signed later. There are some distinct advantages to doing things that way; since build systems are pretty much required to run arbitrary code, it is best if they do not have access to the signing keys. Moving the signing to the mirroring system would remove that danger. IBM has some patches to add signature information to RPMs, but support for signing in the Debian or Fedora mirroring infrastructure has not yet been added.
That all means that we can "potentially have a future" where the files in the packages on your mirrors have signatures that could be written to the xattrs of the files as they are installed. The policy can be set so that lack of a signature means that the file cannot be executed, so there is no window for a race between installing the file and setting the xattr. Policy could be set locally (or by a distribution) to require all binaries to have signatures, or just those run by root.
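The policy choices in that paragraph can be modeled as a small decision function. This is an illustrative model rather than IMA's policy language: `InstalledFile`, `may_execute()`, and the `scope` values are all hypothetical, and signature verification itself is assumed to have already happened.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstalledFile:
    content_digest: str           # hash of the file as it is on disk now
    signed_digest: Optional[str]  # hash from a verified signature, None if unsigned


def may_execute(f: InstalledFile, uid: int, scope: str = "all") -> bool:
    """Decide whether execution is allowed.

    scope is "all" (every execution is appraised) or "root" (only uid 0 is).
    """
    if scope not in ("all", "root"):
        raise ValueError("scope must be 'all' or 'root'")
    if scope == "root" and uid != 0:
        return True  # this execution is outside the policy
    # A missing signature is treated exactly like a bad one, so a file that
    # has been installed but not yet labeled cannot run in the meantime.
    return f.signed_digest is not None and f.signed_digest == f.content_digest
```

Treating "unsigned" as a failure is what closes the race window between installing the file and writing its xattr.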
In addition, IMA appraisal allows tying its policies to security labels as maintained by SELinux, AppArmor, or other Linux security modules (LSMs). So the system could be set up to require signature verification for files that have certain (high-security) labels. The appraisal can be ignored for most cases and only applied for security-critical programs and files.
It is important to note that Linux systems are rarely static; they are updated regularly (or should be), but there is more than just that. Many Linux systems, especially development machines, have locally built programs. Signing software that is locally built brings with it the danger of placing keys in harm's way.
A multi-level security scheme with multiple different security contexts might be one way to provide a useful development machine that still protects the security-critical parts of the system. Anything that wants access to the higher security contexts would require signature verification. Those in lower contexts would be blocked from accessing things like networking, D-Bus, and sensitive parts of the filesystem. Seccomp restrictions could be added as well. This would provide the best of both worlds by giving developers a functional environment while still protecting the most important and dangerous features.
The extended verification module (EVM) is meant to protect the IMA xattrs and other file attributes from offline tampering. A hash or signature of the xattr values, LSM labels, owner, group, and other metadata is stored for each file. EVM is not suitable for use by distributions, though, because one of the pieces of metadata it hashes is the inode number of the file, which is not something that is known by the distribution. EVM is geared toward systems that have temporary access to the key material; they can calculate and sign the EVM value, then lose access to the key.
Shortcomings
IMA appraisal has some shortcomings, however, Garrett said. There is no way to tie policy decisions to which key was used, so you can't set it up to do less appraisal for more-trusted keys and stronger appraisal for less-trusted keys. In addition, the action taken for appraisal failure is set at boot time, so it cannot be changed at runtime.
Beyond that, there are still problems for interpreted languages. The Python binary would likely be signed, for example, thus might be eligible to run in a higher security context. But code can just be piped to Python. That is why other operating systems are moving toward adding some awareness of security contexts into the language interpreter itself.
It turns out that Linux systems typically have many language interpreters installed, including some that may not be obvious at first glance. For example, Emacs and various media file interpreters can also execute code. Garrett has been thinking about ways to not have to modify all of these language interpreters.
One way might be to change how LSM security transitions happen. Currently, they happen when something is executed, but adding a way to taint a process based on a file-open event could prevent piping code to the interpreters. The Linux pipefs could be changed to taint the processes using the pipe. For example, a command like the following:
$ curl ... | bash
The pipe would taint the bash process so that it would not have access to higher security contexts. That still will not protect against interpreters that take command-line arguments (e.g. python -c), though.
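Since the tainting idea is a proposal rather than existing kernel behavior, the following is purely a thought-experiment model of it. Every name in it (`Process`, `open_input()`, `may_enter_high_context()`) is hypothetical; it simply encodes the rule "reading from a pipe taints the reader" and shows which case the rule catches and which it misses.

```python
class Process:
    """Toy model of a process under the proposed pipe-taint rule."""

    def __init__(self, name, signed, argv=()):
        self.name = name
        self.signed = signed    # was the executable's signature verified?
        self.argv = tuple(argv)
        self.tainted = False

    def open_input(self, kind):
        # Proposed rule: opening a pipe taints the process for good.
        if kind == "pipe":
            self.tainted = True

    def may_enter_high_context(self):
        return self.signed and not self.tainted


# curl ... | bash : bash reads its program from a pipe and is tainted.
bash = Process("bash", signed=True)
bash.open_input("pipe")

# python -c "<code>" : the code arrives in argv, so no pipe is involved
# and the rule never fires. This is the gap noted in the text.
python = Process("python", signed=True, argv=("-c", "<code>"))
```

In this model `bash.may_enter_high_context()` is false while `python.may_enter_high_context()` remains true, which is why interpreter-level awareness keeps coming up as the longer-term answer.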
In summary, Garrett said, IMA is an incredibly powerful tool, IMA appraisal goes even further, and IMA appraisal coupled with LSMs goes further still. It is not quite at the point of being deployable, but is getting close. It may even make sense for general-purpose distributions soon.
In the Q&A, James Morris pointed out that there are some complex issues regarding revoking access to resources that have already been opened, which would make Garrett's tainting idea difficult. Garrett acknowledged that, but said that after 15 years of discussing it, perhaps the kernel community should find a way to implement revoke(). Given that the process had the access from a higher security context prior to the taint, though, there may be situations where the lack of a way to revoke that access is still workable.
He was asked about a way to generalize handling language interpreters that can take code from the command line. Garrett said that each interpreter probably needs to be modified to provide a view into what code it is running. For example, the Python community is looking into that; he referred attendees to a recent LWN article about those efforts. Over time, all the different interpreters (Ruby, Lua, Perl, ...) will need to be modified to help support these features; it will take a lot of work over a long period of time, he said. There was some discussion of restricting command-line arguments to the interpreters for certain security contexts but, even if it is workable, it will take some more thinking to determine that.
Another audience member asked about how these mechanisms would work with multi-threaded programs. With a grin, Garrett responded: "wonderful question ... next question". He noted that there were some difficult problems to solve here and that the community should "not get too wrapped up in a perfect solution"; there may be good solutions that will suffice for some use cases. It is those solutions that should be pursued.
[I would like to thank the Linux Foundation for travel assistance to attend OSS in Los Angeles.]
The first half of the 4.14 merge window
As of this writing, just over 8,000 non-merge changesets have been pulled into the mainline kernel repository for the 4.14 development cycle. In other words, it looks like the pace is not slowing down for this cycle either. The merge window is not yet done, but quite a few significant changes have been merged so far. Read on for a summary of the most interesting changes entering the mainline in the first half of this merge window.
Significant user-visible changes include:
- The ORC unwinder has been merged,
supporting more reliable kernel tracebacks and live patching. The
kernel also runs a bit faster when ORC is used instead of frame pointers.
- The control group thread mode patches
have been merged. This paves the way for the CPU controller to
finally appear under the version-2 interface, but that work has not
been merged yet. It may, apparently, still happen during the 4.14
development cycle.
- The AMD secure memory encryption
feature is now supported.
- The RDMA subsystem has a new user-space API based on ioctl().
This API was posted on the linux-api
list at the beginning of August but received no review comments,
perhaps because it lacks any sort of documentation. Doug Ledford
merged it, saying that "it's encased in a Kconfig item that marks it
experimental, so including it doesn't freeze it in stone". Experience
says that it could become frozen regardless of markings, though, if
applications begin to depend on it; developers with an interest in
this API might want to have a close look relatively soon.
- The membarrier() system call has gained a new expedited option that
executes more quickly at the cost of creating inter-processor
interrupts.
- The perf events subsystem continues to develop quickly; changes this
time around include branch-type profiling and tracing support, the
ability to visualize fused
instructions, initial support for
namespaces, and more; this
changelog has an overview.
- The lguest virtualization system
(which some of us still call the "rustyvisor") has been removed due to
lack of interest and maintenance.
- The x86 architecture now supports five-level page tables, allowing
processors to manage up to 128PB of virtual address space on 4PB of
physical memory. Surely nobody will ever need more memory than that.
- CPU-frequency governors can now work
across CPUs. This should lead to better power management, but
also better responsiveness when the system's load changes.
- The MSG_ZEROCOPY patches,
adding zero-copy network transmission, have been merged.
- Progress toward more scalable swapping
continues with work that delays the splitting of huge pages until after
they have been swapped out. This is not the final step (which will
be storing them as huge pages in the swap area), but it still brings a
claimed 42% improvement in swap-out throughput.
- The new MADV_WIPEONFORK option for the madvise()
system call causes the affected memory region to appear to be full of
zeros in the child process after a fork. It differs from the existing
MADV_DONTFORK in that the address range will remain valid in
the child.
- New hardware support includes:
- Audio:
Realtek RT274 codecs,
Wolfson Microelectronics WM8524 codecs, and
Cirrus Logic CS43130 codecs.
- Cryptographic:
Allwinner Security System pseudo-random number generators,
Microchip / Atmel elliptic curve crypto accelerators,
AMD secure processors,
STMicroelectronics STM32 hash accelerators,
Freescale i.MX RNGC random number generators, and
Axis ARTPEC-6/7 hardware crypto accelerators.
- Graphics:
Pervasive Displays RePaper panels,
Synopsys Designware MIPI DSI host DRM bridges,
Synopsys Designware CEC interfaces,
STMicroelectronics STM32 DSI controllers, and
Sitronix ST7586 display panels.
- Hardware monitoring:
IBM Common Form Factor power supplies,
TI TPS53679 monitoring chips, and
Lantiq CPU temperature sensors.
- Industrial I/O:
Linear Technology LTC2471 and LTC2473 analog-to-digital
controllers (ADCs),
Diolan DLN-2 ADCs,
Cirrus Logic EP93XX ADCs,
AMS CCS811 VOC sensors, and
Devantech SRF02 and SRF10 ultrasonic ranger sensors.
- Media:
Omnivision OV5670 sensors,
Analog Devices ADV748x decoders,
ST STV0910 DVB-S/S2 demodulators,
ST STV6111-based tuners,
Amlogic Meson AO CEC interfaces,
MaxLinear MxL5xx-based tuner-demodulators,
GPIO and PWM-based infrared transmitters,
ZTE ZX IR remote controls, and
AMS AS3645A LED flash controllers.
- Miscellaneous:
UniPhier AIDET interrupt controllers,
Pi433 radio modules,
Altera Arria-V/Cyclone-V/Stratix-V CvP FPGA managers,
MediaTek MT6380 power-management ICs,
MediaTek AHCI SATA controllers, and
Renesas R-Car Gen3 SDHI DMA controllers.
- Networking:
Hisilicon HNS3 Ethernet interfaces,
Realtek RTL8822BE wireless network adapters,
Adaptrum Anarion GMAC Ethernet controllers,
Mellanox Technologies MLX5 SRIOV E-Switch switches,
Rockchip Ethernet PHYs,
Huawei PCIE network interfaces, and
Marvell CP110 PHYs.
- Pin control:
UniPhier PXs3 SoC pin controllers,
NXP IMX7ULP pin controllers,
Intel Denverton and Lewisburg pin controllers,
Renesas R8A77995 pin controllers,
Spreadtrum SC9860 pin controllers,
TI TPS68470 GPIO controllers, and
Cavium ThunderX/OCTEON-TX GPIO controllers.
- USB: Ralink USB PHYs and Atheros ath10k USB controllers.
- The IRDA (infrared devices) driver subsystem has been moved to the staging tree with the idea of deleting it entirely in the near future. All IRDA users should have moved to better alternatives some time ago.
Changes visible to kernel developers include:
- For anybody who has been having a hard time building the kernel
documentation: the new sphinx-pre-install script will examine
the system and list the packages that should be installed to have a
complete documentation toolchain.
- spin_unlock_wait() has been removed; its semantics were never
entirely well defined and there did not appear to be any real need for
it.
- Three architectures (ARM, ARM64, and x86) now check the value of the
user-space address limit on return from system calls; this is meant to
prevent security holes resulting from
a failure to reset that value after it is changed. A new task flag
(TIF_FSCHECK) is used to avoid slowing down the return on the
bulk of system calls that don't call set_fs().
- The lockdep cross-release feature has
been merged; this will extend automatic lock checking to several
use patterns where it was not possible before.
- The fast reference-count overflow
protection mechanism has been disabled on x86 due to some
unexplained warnings; presumably it will come back during the 4.14
cycle.
- The arm64 architecture now has support for virtually mapped kernel stacks.
- The kernel has traditionally reserved 20 major numbers for dynamic device-number assignment, but that has proved to be too few on some systems. Starting with 4.14, a new range of 128 numbers starting at 511 and progressing downward has been set aside; they will be used after the original 20 have been exhausted.
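The search order described in the last item can be sketched as follows. The counts (20 and 128) and the starting point of 511 come from the text above; the assumption that the original range begins at 254 and also counts downward is not stated there and is labeled as such in the code. The function names are illustrative and are not the kernel's.

```python
LEGACY_TOP = 254     # ASSUMPTION: top of the original dynamic range
LEGACY_COUNT = 20    # from the text: 20 numbers traditionally reserved
EXT_TOP = 511        # from the text: new range starts at 511
EXT_COUNT = 128      # from the text: 128 numbers, progressing downward


def candidate_majors():
    """Yield major numbers in the order they would be tried."""
    yield from range(LEGACY_TOP, LEGACY_TOP - LEGACY_COUNT, -1)
    yield from range(EXT_TOP, EXT_TOP - EXT_COUNT, -1)


def allocate_major(in_use):
    """Return the first free dynamic major, or None if all are taken."""
    for major in candidate_majors():
        if major not in in_use:
            return major
    return None
```

Under these assumptions a system that has used up the original 20 numbers would next be handed 511, then 510, and so on.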
By the usual schedule, the 4.14 merge window should stay open through September 17, though that schedule has been known to vary at times. The actual 4.14 release is most likely to happen on November 5 or 12.
Running Android on a mainline graphics stack
The Android system may be based on the Linux kernel, but its developers have famously gone their own way for many other parts of the system. That includes the graphics subsystem, which avoids user-space components like X or Wayland and has special (often binary-only) kernel drivers as well. But that picture may be about to change. As Robert Foss described in his Open Source Summit North America presentation, running Android on the mainline graphics subsystem is becoming possible and brings a number of potential benefits.
He started the talk by addressing the question of why one might want to use mainline graphics with Android. The core of the answer was simple enough: we use open-source software because it's better, and running mainline graphics takes us toward a fully open system. With mainline graphics, there are no proprietary blobs to deal with. That, in turn, makes it easy to run current versions of the kernel and higher-level graphics software like Mesa.
Getting the security fixes found in current kernels is worth a lot in its
own right, but up-to-date kernels also bring new features, lots of bug
fixes, better performance, and reduced power usage. The performance
and power-consumption figures for most hardware tend to improve for years
after its initial release as developers find ways to further optimize the
software. Running a fully free system increases the possibilities for
long-term support. Many devices have a ten-year (or longer) life span; if
they are running free software, they can be supported by anybody. That is,
Foss said, one of the main reasons why the GPU vendors tend not to
open-source their drivers. Using mainline graphics also makes it possible
to support multiple vendors with a single stack, and to switch vendors at
will.
At the bottom of the Android graphics stack is the kernel, of course; but the layer above that tends to be a proprietary vendor driver. That driver, like most GPU drivers, has a substantial user-space component. Android's display manager is SurfaceFlinger; it takes graphical objects from the various apps and composes them onto the screen. The interface between SurfaceFlinger and the driver is called HWC2; it is implemented by the user-space component of the vendor driver. Among other things, HWC2 implements common interfaces like OpenGL and Vulkan.
The HWC2 interface is also responsible for composing objects into the final display and implementing the abstractions describing those objects. When possible, it will offload work from the GPU to a hardware-based compositor. In the end, he said, GPUs are not particularly good at composing, so offloading that work can speed it up and save power. HWC2 is found in ChromeOS as well as in Android.
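The offload decision can be illustrated with a toy model. Everything here is hypothetical: the Layer type, the needs_gpu flag, and the plane count are invented for the sketch and are not part of HWC2. A real implementation must also reserve a hardware plane for the GPU's own output, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    z_order: int
    # Hypothetical flag: set when the display hardware cannot scan this
    # layer out directly (an unsupported format or transform, say).
    needs_gpu: bool = False


def assign_composition(layers, hw_planes):
    """Map each layer name to "hardware" or "gpu".

    Layers are considered from the bottom of the stack upward; each one
    takes a hardware plane if it can and one is still free, otherwise it
    falls back to composition on the GPU.
    """
    result = {}
    free = hw_planes
    for layer in sorted(layers, key=lambda item: item.z_order):
        if not layer.needs_gpu and free > 0:
            result[layer.name] = "hardware"
            free -= 1
        else:
            result[layer.name] = "gpu"
    return result


layers = [
    Layer("wallpaper", 0),
    Layer("app", 1),
    Layer("status-bar", 2, needs_gpu=True),
    Layer("nav-bar", 3),
]
assignment = assign_composition(layers, hw_planes=2)
print(assignment)
```

With only two planes available, the two lowest layers are handed to the hardware and the remaining two are left for the GPU.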
To create an open-source stack, one clearly has to replace the proprietary vendor drivers. That means providing a driver for the GPU itself and an implementation of the HWC2 API. The latter can be found in the drm_hwc (or drm_hwcomposer) project, which was originally written at Google but which has since escaped into the wider community. It is sometimes used on Android systems now, Foss said, especially in embedded settings. The manufacturers of embedded devices are finding that their long-term support needs are well met with open-source drivers.
So a free Android stack is built around drm_hwc. It also includes components like Mesa and libdrm, and it's all based on the kernel's direct rendering manager (DRM) layer. Finally, there is a component called gbm_gralloc, which handles memory allocations and associates properties (which color format is in use, for example) with video buffers.
So what is the status of this work? There are a couple of important kernel components that were prerequisites to this support; one of those is buffer synchronization, which has recently been merged. This feature allows multiple drivers to collaborate around shared buffers; it was inspired by a similar feature in the Android kernel. Some GPU drivers now have support for synchronization. The other important piece was the atomic display API; it's the only API that supports synchronization. Most drivers have support for this API at this point, which is good, since HWC2 requires it.
There are a few systems where all of this works now. The i.MX6 processor with the Vivante gc3000 GPU has complete open-source support; versions with older GPUs are not yet supported at the same level. There is support for the DragonBoard 410c with the Adreno GPU. The MinnowBoard Turbot has an Intel HD GPU which has "excellent open-source software support". Finally, the HiKey 960 is a new high-end platform; it's not supported yet but that support is "in the works".
Foss concluded by saying that support for Android on the mainline graphics stack is now a reality for a growing number of platforms. The platforms he named are development boards and such, though, so your editor took the opportunity to ask if there was any prospect for handsets with mainline graphics support in the future. Foss answered that there are "rumors" that Google likes this work and is keeping an eye on it. Time will tell whether those rumors turn into mainstream Android devices that can run current mainline kernels with blob-free graphics support.
[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting your editor's travel to the Open Source Summit.]
A different approach to kernel configuration
The kernel's configuration system can be challenging to deal with; Linus Torvalds recently called it "one of the worst parts of the whole project". Thus, anything that might help users with the process of configuring a kernel build would be welcome. A talk by Junghwan Kang at the 2017 Open Source Summit North America demonstrated an interesting approach, even if it's not quite ready for prime time yet.
Kang is working on a Debian-based, cloud-oriented distribution; he wanted to tweak the kernel configuration to minimize the size of the kernel and, especially, to reduce its attack surface by removing features that were not needed. The problem is that the kernel is huge, and there are a lot of features that are controlled by configuration options. There are over 300 feature groups and over 20,000 configuration options in current kernels. Many of these options have complicated dependencies between them, adding to the challenge of configuring them properly.
Kang naturally turned to the work that others have already done in an attempt to simplify his kernel-configuration task. One interesting tool is undertaker-tailor (also known as "the valiant little tailor"), which came out of the VAMOS project. It uses the ftrace tracing mechanism to watch a kernel while the system runs a representative workload. From the resulting traces, it concludes which parts of the kernel are actually used, finds the configuration options controlling those parts, then generates a configuration that only includes the needed subsystems. This system, Kang said, is novel, but "incomplete".
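The trace-to-configuration idea can be sketched in a few lines. This is a rough illustration of the approach, not undertaker-tailor's implementation: the two lookup tables are hand-written examples, and the helper names are invented. A real tool would have to derive the mappings from debugging information and the kernel's build files.

```python
# Illustrative, hand-written tables (assumptions for this sketch).
FUNCTION_TO_FILE = {
    "tcp_sendmsg": "net/ipv4/tcp.c",
    "ext4_file_write_iter": "fs/ext4/file.c",
    "e1000_xmit_frame": "drivers/net/ethernet/intel/e1000/e1000_main.c",
}
FILE_TO_OPTION = {
    "net/ipv4/tcp.c": "CONFIG_INET",
    "fs/ext4/file.c": "CONFIG_EXT4_FS",
    "drivers/net/ethernet/intel/e1000/e1000_main.c": "CONFIG_E1000",
}


def options_from_trace(traced_functions):
    """Return (options, unknown) for the functions seen in a trace.

    options is the set of configuration options covering the traced
    code; unknown lists functions that could not be mapped to an option.
    """
    options = set()
    unknown = []
    for func in traced_functions:
        source = FUNCTION_TO_FILE.get(func)
        option = FILE_TO_OPTION.get(source)
        if option is None:
            unknown.append(func)
        else:
            options.add(option)
    return options, unknown


def config_fragment(options):
    """Render the options as lines of a kernel configuration file."""
    return "\n".join(f"{opt}=y" for opt in sorted(options))


trace = ["tcp_sendmsg", "ext4_file_write_iter", "tcp_sendmsg", "mystery_fn"]
options, unknown = options_from_trace(trace)
print(config_fragment(options))
print("unmapped:", unknown)
```

Code that the workload never exercises (the e1000 driver in this example) simply does not show up in the result, which is both the appeal of the approach and the reason that the generated kernel may fail to boot.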
In particular, undertaker-tailor has a number of bugs; "it doesn't work and needs an overhaul". Kang tracked down and fixed some of the bugs, sending his fixes upstream in the process. The tool was badly confused by address-space layout randomization, for example. He fixed a few issues until he could get a configuration out of it. Unfortunately, the resulting kernel failed to boot. It turns out that this tool requires the user to spend some time setting whitelists and blacklists, but that brings the user back to the original configuration issue.
Another tool for trimming down a kernel configuration is the make localmodconfig command. It simply looks at the modules loaded into the running kernel and assumes that each is there because something needed it. It generates a kernel configuration that builds in those modules and leaves out the rest. This approach did create a working kernel, but that kernel was still "fat", with numerous features configured in that were not really needed.
So Kang went off to create a solution of his own. He wanted to come up with an automated system that would create a minimally sized but working kernel for his specific workload. His solution uses undertaker-tailor to collect system traces with the use cases of interest running. But then a separate "tailoring manager" runs to create the configuration from the trace data. As was the case before, this configuration is unlikely to boot and run properly. So another process works to "fill in" configuration options until the kernel eventually works.
This filling-in stage uses the localmodconfig configuration as a starting point; it thus won't fill in options that are already known not to be necessary. The first stage looks at warnings from the configuration system itself, adding options until the warnings are addressed. Then kernels are built and tested using Xnee to simulate a desktop session. There is also a hand-built blacklist used to explicitly exclude some options.
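The add-until-it-works loop at the heart of this stage might look something like the following. Kang's tool has not been released, so this is only a guess at the general shape: the function name, its parameters, and the boots() callback (standing in for a build followed by a test run in a virtual machine) are all assumptions, and the separate pass over configuration-system warnings is left out.

```python
def fill_in(tailored, candidates, blacklist, boots):
    """Add options to a trace-derived configuration until it works.

    tailored   -- options produced from the trace data
    candidates -- ordered options taken from the localmodconfig result
    blacklist  -- options that must never be added
    boots      -- callable taking a set of options and returning True
                  if the resulting kernel builds and passes its tests
    Returns the working configuration and the number of kernels tried.
    """
    config = set(tailored)
    pending = [opt for opt in candidates
               if opt not in config and opt not in blacklist]
    attempts = 0
    while True:
        attempts += 1
        if boots(config):
            return config, attempts
        if not pending:
            raise RuntimeError("ran out of candidate options")
        config.add(pending.pop(0))


# Stand-in for the real build-and-test step: this pretend kernel works
# once it has both a filesystem and a block driver.
required = {"CONFIG_EXT4_FS", "CONFIG_VIRTIO_BLK"}
config, attempts = fill_in(
    tailored={"CONFIG_EXT4_FS"},
    candidates=["CONFIG_SOUND", "CONFIG_VIRTIO_BLK", "CONFIG_VIRTIO_NET"],
    blacklist={"CONFIG_SOUND"},
    boots=lambda cfg: required <= cfg,
)
print(sorted(config), attempts)
```

Since every iteration stands for a full kernel build and test run, it is easy to see how the real process could take hours.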
This process, which involves building and testing a lot of kernels in virtual machines, takes about five hours to run. It generates a kernel that is quite a bit smaller than what make localmodconfig provides, with almost all modules configured out. As a bonus, this kernel boots in 1/5 of the time.
Future steps include creating a larger set of workloads to be sure that all use cases for this distribution have been addressed. At some point, Kang also plans to add support for kernels running on bare metal; currently, only virtualized kernels can be configured in this way. Even now, though, he said that the resulting tool is useful for non-expert kernel users who are trying to build a kernel using something smaller than a kitchen-sink distribution configuration. Those users will have to wait, though, since Kang has not yet released this project to the world; he said he would like to do that once he receives management approval.
Postscript
Presentations of this type are often as useful for the problem they pose as for the solutions they present. In this case, it's not entirely clear that "non-expert users" will find it easier to create representative workloads that cover all needed tasks, run them with a kernel under tracing, create a suitable blacklist, and generate their final configuration. The task still seems daunting.
The problem is not Kang's solution, though; the problem is that he was driven to create such a solution just to get through the task of configuring a kernel to his needs. The kernel's configuration system is, indeed, one of the worst parts of the project. But it is also a part that nobody is really working on; it receives a bit of maintenance, but there does not appear to be any significant effort out there to address its shortcomings. Two hundred companies support work on each kernel development cycle, but none of them sees the configuration system as one of the problems that it needs to solve. Until that changes, we are likely to continue to see users struggling with it.
[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting your editor's travel to the Open Source Summit.]
Mongoose OS for IoT prototyping
Mongoose OS is an open-source operating system for tiny embedded systems. It is designed to run on devices such as microcontrollers, which are often constrained with memory on the order of tens of kilobytes, while exposing a programming interface that provides access to modern APIs normally found on more powerful devices. A device running Mongoose OS has access to operating system functionality such as filesystems and networking, plus higher-level software such as a JavaScript engine and cloud access APIs.
Mongoose OS is not meant to compete in the solution space occupied by Linux and its ilk; instead it is a cross-platform Internet of Things (IoT) development toolkit for tiny devices that aren't powerful enough to run a full-featured operating system. Unlike embedded operating systems like Zephyr, Mongoose OS does not implement the entire OS stack. Instead, it relies on either the underlying hardware's SDK or a small realtime operating system like FreeRTOS to provide most of the low level access. It builds upon the underlying OS primitives to provide higher-level APIs for programmers that the embedded operating systems don't normally provide.
There is a wide range of embedded hardware specifically for IoT use; software toolkits to run on them include ARM Mbed and Arduino. Both are open-source toolkits but they are intrinsically tied to their respective hardware platforms. Mongoose OS is cross-platform and delivers a consistent programming interface across all of the different devices it supports. The goal of Mongoose OS is to give IoT developers as much software scaffolding as possible to rapidly prototype applications for their platforms. This includes support for over-the-air (OTA) updates, encryption, remote management, as well as easy-to-use APIs to control IoT devices, to get data from them, and to funnel that data to a cloud-backed service.
Once in the cloud, the data can be processed by analytics or used by any other web service. The remote procedure call (RPC) features, accessible via a web services API, make devices running Mongoose OS controllable via the cloud. One can imagine industrial or scientific installations operating an array of sensors and other data-collecting devices run by microcontrollers that pump the data they collect into the cloud to be retrieved or analyzed by users. Mongoose OS is one option for building the layer of software that is required to provide an interface between the hardware and its users.
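To make the RPC idea concrete, here is a generic sketch of a JSON-encoded request and reply. It is not Mongoose OS's wire format or API: the frame layout, the "Example.SetLed" method name, and the handler table are all hypothetical, and no network transport is shown.

```python
import itertools
import json

_ids = itertools.count(1)  # request IDs let replies be matched to calls


def make_request(method, args):
    """Caller side: encode one call as a JSON frame."""
    return json.dumps({"id": next(_ids), "method": method, "args": args})


def handle_request(frame, handlers):
    """Device side: decode a frame, run the handler, encode the reply."""
    request = json.loads(frame)
    handler = handlers.get(request["method"])
    if handler is None:
        reply = {"id": request["id"],
                 "error": {"message": "no such method"}}
    else:
        reply = {"id": request["id"],
                 "result": handler(**request["args"])}
    return json.dumps(reply)


# Hypothetical device state and handler, standing in for real hardware.
led = {"on": False}


def set_led(on):
    led["on"] = bool(on)
    return {"on": led["on"]}


handlers = {"Example.SetLed": set_led}

frame = make_request("Example.SetLed", {"on": True})
reply = json.loads(handle_request(frame, handlers))
print(reply)
```

The same pattern works whether the frame travels over HTTP, a serial line, or a message broker, which is part of what makes an RPC layer convenient for remote management.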
Architecture
Mongoose OS provides its own networking and some peripheral drivers as well as higher-level abstractions for IoT programming. There is a network library, a filesystem layer, and an API to access hardware-specific functionality such as GPIO, I2C, SPI, UART, and others. There is also a library to access cloud-based services such as Amazon AWS IoT Platform, Microsoft Azure IoT suite, Google IoT Core, Adafruit IO, Samsung ARTIK Cloud, and Blynk. You can make generic HTTP calls to implement an API for other web services. A virtual filesystem layer allows the mounting of different filesystems, but currently the only supported one is the SPI Flash File System (SPIFFS), which is a file system intended for flash-memory-based storage.
Mongoose OS can run on ESP32 and ESP8266 microcontrollers by Espressif Systems or the Texas Instruments CC3200. There is partial support for nRF52 by Nordic Semiconductor and the STM32 by STMicroelectronics, and a port for the Texas Instruments CC3220 is almost complete. On the ESP32 and CC3200, Mongoose OS runs on top of FreeRTOS, which provides task scheduling and primitives such as semaphores and queues. On the more resource-constrained ESP8266, Mongoose OS uses the SDK provided by Espressif Systems.
Embedded together with the system is a tiny JavaScript engine called mJS that implements a subset of the language; API calls can be made from C, JavaScript, or a mix of both. A developer can directly call C functions from JavaScript using the foreign function interface call, ffi(). The limitations of mJS are that it has no standard library, closures, exceptions, or Unicode strings, and there are some syntax restrictions. However, the JSON.parse() and JSON.stringify() calls are available, which is handy for the JSON object manipulation that is often needed when talking to web services.
Mongoose OS supports SPI flash encryption for securing IoT devices. On the ESP32, Mongoose OS can take advantage of the built-in encryption hardware. It also supports the ATECC508A crypto chip, which can be added to a device. These mechanisms allow the encryption of the flash memory to prevent access to any of the data in the event an attacker has physical access to the devices. There is an RPC mechanism for OTA updates of the firmware; device developers can configure their own server to update from.
Libraries and apps
There is a repository of libraries provided by the Mongoose OS developers that a user can download from to help build their application. Using these libraries, a program can write to or read from hardware connections such as GPIO pins and route the data over the network. There is a compatibility layer for the use of Arduino libraries, which should give users more options.
An app is a Mongoose OS program built and flashed onto the device. There is a repository of community-contributed apps for reference. Most of them are sample applications that show how to use a certain library or hardware feature. For example, one clever hack is the automatic heater control built using a NodeMCU running Mongoose OS, I2C temperature sensors, and an AWS account.
Licensing
Mongoose OS is an open-source product of Cesanta, a company founded by Sergey Lyubka in Ireland to develop and market the operating system. There are two classes of Mongoose OS users: hobbyists and commercial customers. Hobbyists will find a fun and easy operating system to bootstrap their home IoT projects quickly. The system is available under the terms of the GPLv2; the company asserts that user programs that link to it also need to be released under a GPLv2-compatible license. However, if there are users that prefer a different licensing model, Cesanta offers alternative arrangements for paying customers. There is an open-source community around Mongoose OS, but all external code contributors must grant Cesanta the right to relicense the code.
Trying it out
Mongoose OS is available from the project's download page, which offers a tool called mos to build and deploy the system on the user's embedded hardware. On Ubuntu, a convenient PPA is available for version 16.04 and up. However, to run the OS itself, one of the supported boards is required, as there is no emulator that works with it.
To try out Mongoose OS, I purchased an ESP8266-powered NodeMCU. It is a microcontroller that includes an on-board WiFi chip. I connected it via USB to my Ubuntu machine and it was detected as a serial USB device. I added the Mongoose OS PPA and used the package manager to install the mos management utility, which allows a user to flash a supported hardware module with Mongoose OS; the package also includes some development tools.
Running mos brings up a web interface on the default system browser that is connected to a tiny web server that the management tool runs on your host machine. The web interface can be used to set up the connected device, which is a simple three-step process: specify the serial USB interface it is connected on, identify the attached chip and flash the OS image, and specify the wireless access point information for the device to connect to.
After the initial setup, the web interface gives the user an integrated development environment (IDE) in which projects can be created, applications or libraries imported, and the result built and flashed onto the device. I used the Blynk sample app from the repository to try out the cloud features. Blynk is a cloud service that allows a user to control a network-connected embedded system via a proprietary mobile app, which is available for free download but has in-app purchases of certain features. However, the Blynk library and sample app on the device side are open source; flashing the app to the device was a straightforward process. Using the mobile app, I could turn the LED on the board on and off. It was a quick way to prototype an IoT application; a user can be up and running in minutes.
Conclusion
For anyone tinkering with Internet of Things programming, Mongoose OS takes the hassle out of interfacing with the low-level hardware for some tiny embedded devices. The range of hardware it supports is limited, but the boards that it does support are quite popular. The repository of apps and libraries for Mongoose OS is also still in its infancy. The features roadmap does include more hardware support, better cloud integration, and the ability to mount SD cards. Being able to set up and program an IoT device easily should be an attractive option for hobbyists and commercial developers looking for rapid prototyping.
[I would like to thank the Mongoose OS developers and community for help in clarifying aspects of the project.]
Page editor: Jonathan Corbet
