The Intelligent Platform Management Interface (IPMI) is a set of system-management-and-monitoring APIs typically implemented on server motherboards via an embedded system-on-chip (SoC) that functions completely outside of the host system's BIOS and operating system. While it is intended as a convenience for those who must manage dozens or hundreds of servers in a remote facility, IPMI has been called out for its potential as a serious hole in server security. At the 2016 Embedded Linux Conference in San Diego, Tian Fang presented Facebook's recent work on OpenBMC, a Linux distribution designed to replace proprietary IPMI implementations with an open-source alternative built around standard facilities like SSH.
To recap, IPMI SoCs are known as baseboard management controllers (BMCs). The BMC is connected to most of the standard buses on the motherboard, so it can monitor temperature and fan sensors, storage devices, and expansion cards, and can even access the network (through its own virtual network interface that includes a separate MAC address). But BMCs almost invariably ship with a proprietary IPMI implementation in binary-blob form (though, in most cases, that blob is running on Linux), whose functionality is limited to what the vendor chooses to provide. Furthermore, as Matthew Garrett outlined quite memorably at linux.conf.au in 2015, IPMI is riddled with poor security and, thus, leaves servers vulnerable to all sorts of attacks. Once the BMC has been compromised, the attacker has direct access to essentially every part of the server.
Most server vendors, Fang said at the start of his talk, provide their own BMCs. The "white box" server or generic server motherboard markets, however, are dominated by ARM-based BMC SoCs from ASPEED. So when Facebook started working on the latest machine design in its Open Compute Project (OCP)—which creates and releases open hardware designs for white-box data-center machines—it went with an ASPEED BMC as well. But, in keeping with OCP's philosophy, it decided to build its own Linux distribution to run on the BMC, rather than buy a proprietary IPMI stack.
The resulting distribution is called OpenBMC. It was first deployed in 2013 with the Wedge, a "top-of-rack network switch" design from OCP, although Fang noted that, from a design standpoint, the team looked at the switch as "just another server." OpenBMC has subsequently been deployed on several of Wedge's successor designs.
Currently, OpenBMC runs on only a few ASPEED SoCs, and Fang pointed out that the distribution is tailored to the specific OCP machines used by the project, even though similar SoCs may be found in non-OCP servers. The supported SoCs from ASPEED include the common AST2300 and AST2400 (both operating at 400MHz), plus the newer AST2500 (at 800MHz and adding support for PCIe Host) that is now being tested (but has not yet been put into production servers). All of these SoCs come with a large set of GPIO pins (232 in the AST2500) and multiple independent I2C controllers, so that they can monitor many hardware features simultaneously. But, Fang said, OpenBMC provides "standard Linux server" tools rather than a traditional IPMI interface; it exposes those hardware-monitoring features through lm-sensors. Administrators can connect to the OpenBMC instance using key-based SSH authentication, rather than IPMI's authentication scheme, and Facebook has written a REST-ful web service for monitoring multiple machines remotely. Among the other services included are DHCP, Link Layer Discovery Protocol (LLDP) utilities, Python, and various hardware-monitoring tools.
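To give a sense of what "standard Linux server" monitoring looks like in practice, here is a minimal sketch in Python of reading temperature sensors the way lm-sensors-style tools do. It is not taken from OpenBMC; it assumes the kernel exposes sensors through the usual hwmon sysfs layout (`/sys/class/hwmon/hwmonN/temp*_input`, reporting millidegrees Celsius), which may differ on a given BMC kernel. The function name `read_hwmon_temps` is made up for illustration.

```python
from pathlib import Path


def read_hwmon_temps(hwmon_root="/sys/class/hwmon"):
    """Return {"<chip>/<sensor>": degrees_celsius} for each temp*_input file.

    Assumes the conventional hwmon sysfs layout; this is an illustration,
    not code from OpenBMC.
    """
    readings = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return readings  # no hwmon support (or not running on Linux)

    for chip_dir in sorted(root.glob("hwmon*")):
        # Each chip directory normally carries a "name" file; fall back to
        # the directory name if it is missing.
        name_file = chip_dir / "name"
        chip = name_file.read_text().strip() if name_file.exists() else chip_dir.name

        for sensor in sorted(chip_dir.glob("temp*_input")):
            try:
                millidegrees = int(sensor.read_text().strip())
            except (OSError, ValueError):
                continue  # skip sensors that cannot be read or parsed
            label = sensor.name[: -len("_input")]  # e.g. "temp1"
            readings[f"{chip}/{label}"] = millidegrees / 1000.0
    return readings
```

Run over an SSH session on the BMC, `read_hwmon_temps()` would return a plain dictionary that ordinary scripts can consume, with no IPMI client involved.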
Fang then outlined the process of porting OpenBMC to a new board. The system uses the U-Boot bootloader and an embedded Linux distribution created with Yocto. The Yocto build includes three layers. The first is a common layer (meta-openbmc) that defines all of the packages used in OpenBMC. On top of that comes a general BMC layer that enables the bootloader, C library, and kernel recipes for a particular SoC vendor (at present, meta-aspeed). At the top comes a machine-specific layer (e.g., meta-wedge) that contains the specific kernel, bootloader, and sensor-software configurations and also enables the recipes for Facebook's custom user-space monitoring programs. For example, meta-aspeed includes recipes that define the GPIO tables for the various ASPEED SoCs, while meta-wedge configures each of those GPIOs correctly for the Wedge server (which pins are voltage monitors, which are used for fan control, and so on).
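The split between the SoC layer and the machine layer can be modeled in a few lines of Python. This is only a conceptual sketch, not BitBake syntax and not OpenBMC code: the pin names, numbers, and role names below are made up for illustration, as is the `resolve_pins` helper. It shows the idea that one layer knows which pins exist while the other decides what each pin is used for on a particular board.

```python
# "SoC layer": which GPIO pins exist and their numbers (illustrative values).
SOC_GPIO_TABLE = {"GPIOA0": 0, "GPIOA1": 1, "GPIOB0": 8}

# "Machine layer": what this particular board wires each pin to (hypothetical).
MACHINE_ROLES = {"GPIOA0": "fan_control", "GPIOB0": "voltage_monitor"}


def resolve_pins(soc_table, machine_roles):
    """Map each board-level role to a GPIO number defined by the SoC table.

    Raises KeyError if the machine configuration names a pin that the SoC
    table does not define, which is the kind of mismatch a porting effort
    would want to catch early.
    """
    unknown = sorted(set(machine_roles) - set(soc_table))
    if unknown:
        raise KeyError(f"pins not defined for this SoC: {unknown}")
    return {role: soc_table[pin] for pin, role in machine_roles.items()}
```

Under this model, porting to a new board on the same SoC means supplying a new role mapping while reusing the pin table unchanged.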
He showed several example recipes (which are available in the session slides [PDF]), including a GPIO configuration, an lm-sensors configuration, and a libpal example. Libpal is a network-packet assembly library; Facebook uses it to generate a small set of IPMI messages to provide compatibility with other monitoring tools. He also briefly discussed the REST-ful web service developed by Facebook. The feature set does not veer much (if any) from the ordinary—administrators connect and request server status, which is returned in JSON—although he did point out that the service works with cURL and can, therefore, be scripted, in addition to working through the browser. The notable feature, of course, is that the service runs on the BMC rather than on the server's CPU.
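Since the service returns JSON over HTTP, scripting against it takes only the Python standard library. The sketch below is an assumption-laden illustration: the talk did not specify endpoint paths or the response schema, so the URL passed to `fetch_status` and the `"temperatures"` key are hypothetical, as are both function names.

```python
import json
import urllib.request


def fetch_status(url, timeout=5.0):
    """GET a status document from a BMC web service and decode it as JSON.

    The URL is supplied by the caller; no real OpenBMC endpoint is assumed.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def overheated(status, limit_c=80.0):
    """Return the sorted names of sensors reporting more than limit_c.

    Assumes a hypothetical payload shaped like
    {"temperatures": {"<sensor name>": <degrees Celsius>, ...}}.
    """
    temps = status.get("temperatures", {})
    return sorted(name for name, value in temps.items() if value > limit_c)
```

A monitoring script could loop over a list of BMC addresses, call `fetch_status()` for each, and alert on any non-empty result from `overheated()`, all without touching the host CPU.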
There are currently thousands of OCP Wedge switches (and their successors) in Facebook data centers running OpenBMC, Fang said. OpenBMC allows Facebook to run more recent kernel releases than one would typically find in a vendor-provided IPMI stack; the latest deployments use 4.1. That said, upgrading a BMC involves flashing a new root filesystem, kernel, and bootloader to the SoC's EEPROM, a process that can occasionally go awry. But the BMC on the Wedge and other OCP switches sits in a removable socket, so it can be pulled and replaced if necessary.
Fang cited two issues as the focus of current OpenBMC development. The first is driver stability. Vendor board-support packages, he said, typically focus only on board bring-up, while thousands of machines running different workloads encounter a wide variety of bugs. "When you deploy thousands of machines, every 'edge case' becomes common." The second is improving the tooling provided in OpenBMC. He noted that more development tools are being ported to the distribution, along with the Chef provisioning system and additional monitoring programs.
The session ended with a brief question-and-answer period. Fang reported that, so far, no non-ASPEED BMCs are supported, but that the OpenBMC team would love to work with any BMC company that was interested. He also said that Facebook has started work on a BMC development board that will hopefully serve as a low-cost entry point for interested developers.
But the biggest question was whether Facebook would be pushing any of its OpenBMC work into the mainline kernel. From the audience, Grant Likely mentioned that Linaro would like to see patches, to which Fang replied that the OpenBMC team is interested, but that it does not consider most of the code to be in a ready-to-upstream, fully working state. ARM SoC tree co-maintainer Arnd Bergmann, also in the audience, encouraged Fang and others working on OpenBMC not to wait for some point when everything is "complete," but to start sending in patches incrementally, even if they will require additional revisions and effort. "'Working' has never been a requirement for upstream," he said, which the audience found amusing. "The main thing we want to see is that the code is maintainable and that someone is interested in working on it."
Fang seemed agreeable to that suggestion, so perhaps it is only a matter of time before the mainline kernel will run on at least some portion of the BMCs found in today's data centers and network closets. It may only be a start, and only support one BMC vendor, but that itself is still noteworthy progress in comparison to the unappealing security and software-freedom issues that plague so many systems via the IPMI framework.
[The author would like to thank the Linux Foundation for travel assistance to attend ELC 2016.]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 12, 2016 23:56 UTC (Tue) by flewellyn (subscriber, #5047) [Link]
The music fan in me is mildly disappointed, however, that they didn't call the project "Run-BMC".
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 0:21 UTC (Wed) by rahvin (subscriber, #16953) [Link]
I'd love to be able to ssh -x and get the access to the boot bios via a window with full KVM support and the ability to soft/hard reboot etc. There would no doubt be some nice support contracts for something like this if a company kept it all up to date with the latest security exploits.
What a world where BMC's aren't a security threat on nearly every system.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 1:07 UTC (Wed) by pabs (subscriber, #43278) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 6:05 UTC (Wed) by benh (subscriber, #43720) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 1:07 UTC (Wed) by dskoll (subscriber, #1630) [Link]
I would like to second and third the previous comments. The proprietary BMC controllers I've worked with (with the possible exception of Dell's iDRAC) are garbage. I have a server 200km away whose BMC is hung and cannot be restarted even with a hard reboot command from ipmitool.
I'm going to have to power that sucker down by physically unplugging the power cords and hope the BMC comes back, and I can't do that until I schedule a visit to the data centre. Grrrrrr....
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 4:41 UTC (Wed) by raven667 (subscriber, #5198) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 22:13 UTC (Wed) by jhhaller (subscriber, #56103) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 14, 2016 2:37 UTC (Thu) by mebrown (subscriber, #7960) [Link]
I wonder how hard it would be to pull in the openbmc layer?
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 6:04 UTC (Wed) by benh (subscriber, #43720) [Link]
We have rewritten the ASpeed SoC kernel pretty much from scratch and cleaned up a bunch of the drivers,
made it device-tree based etc... and are about to start submitting it all upstream.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 13:59 UTC (Wed) by k3ninho (subscriber, #50375) [Link]
K3n.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 21:16 UTC (Wed) by iabervon (subscriber, #722) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 23:29 UTC (Wed) by Creideiki (subscriber, #38747) [Link]
You are familiar with psDooM, I assume?
OpenBMC, a distribution for baseboard management controllers
Posted Apr 14, 2016 11:17 UTC (Thu) by k3ninho (subscriber, #50375) [Link]
K3n.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 13, 2016 21:24 UTC (Wed) by benh (subscriber, #43720) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 14, 2016 1:30 UTC (Thu) by mebrown (subscriber, #7960) [Link]
Response times are more a matter of good engineering of the userspace apps.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 15, 2016 12:28 UTC (Fri) by k3ninho (subscriber, #50375) [Link]
K3n.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 15, 2016 14:28 UTC (Fri) by raven667 (subscriber, #5198) [Link]
OpenBMC, a distribution for baseboard management controllers
Posted Apr 15, 2016 16:22 UTC (Fri) by k3ninho (subscriber, #50375) [Link]
Now, if we're going by what I wrote: I said 'I'd prefer to have BMC's with simpler capabilities to ensure robust response times' which needs anyone reading to infer that not running eBPF programs in full-Linux or having simpler capabilities means a dedicated RT kernel is necessary to ensure robust response times. That's a stretch, especially when I intended to convey other meanings, including things like under-committing resources to ensure spare overhead. Whether that's done by an RT kernel or a multi-user, pre-emptive multitasking kernel is something I wanted to leave open for someone else to say what and why they chose that option.
K3n.
OpenBMC, a distribution for baseboard management controllers
Posted Apr 26, 2016 8:51 UTC (Tue) by hailfinger (guest, #76962) [Link]
Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds