Running Android on a mainline graphics stack
The Android system may be based on the Linux kernel, but its developers have famously gone their own way for many other parts of the system. That includes the graphics subsystem, which avoids user-space components like X or Wayland and has special (often binary-only) kernel drivers as well. But that picture may be about to change. As Robert Foss described in his Open Source Summit North America presentation, running Android on the mainline graphics subsystem is becoming possible and brings a number of potential benefits.
He started the talk by addressing the question of why one might want to use mainline graphics with Android. The core of the answer was simple enough: we use open-source software because it's better, and running mainline graphics takes us toward a fully open system. With mainline graphics, there are no proprietary blobs to deal with. That, in turn, makes it easy to run current versions of the kernel and higher-level graphics software like Mesa.
Getting the security fixes found in current kernels is worth a lot in its
own right, but up-to-date kernels also bring new features, lots of bug
fixes, better performance, and reduced power usage. The performance
and power-consumption figures for most hardware tend to improve for years
after its initial release as developers find ways to further optimize the
software. Running a fully free system increases the possibilities for
long-term support. Many devices have a ten-year (or longer) life span; if
they are running free software, they can be supported by anybody. That is,
Foss said, one of the main reasons why the GPU vendors tend not to
open-source their drivers. Using mainline graphics also makes it possible
to support multiple vendors with a single stack, and to switch vendors at
will.
At the bottom of the Android graphics stack is the kernel, of course; but the layer above that tends to be a proprietary vendor driver. That driver, like most GPU drivers, has a substantial user-space component. Android's display manager is SurfaceFlinger; it takes graphical objects from the various apps and composes them onto the screen. The interface between SurfaceFlinger and the driver is called HWC2; it is implemented by the user-space component of the vendor driver. Among other things, that same user-space driver component implements common interfaces like OpenGL and Vulkan.
The HWC2 interface is also responsible for composing objects into the final display and implementing the abstractions describing those objects. When possible, it will offload work from the GPU to a hardware-based compositor. In the end, he said, GPUs are not particularly good at composing, so offloading that work can speed it up and save power. HWC2 is found in ChromeOS as well as in Android.
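As a concrete illustration, here is a simplified and entirely hypothetical sketch, in C, of the per-frame decision an HWC2 implementation makes. The types and helper below are invented for illustration; the real interface is defined in hwcomposer2.h, works through queried function pointers, and names the composition types HWC2_COMPOSITION_DEVICE and HWC2_COMPOSITION_CLIENT:

```c
/* Hypothetical sketch of HWC2's validate/present decision, not the
 * real hwcomposer2.h API. */

#include <stdbool.h>
#include <stddef.h>

enum composition { DEVICE, CLIENT };

struct layer {
	bool needs_rotation;	/* a transform many display blocks can't do */
	enum composition type;
};

/* Hypothetical capability check: can the display hardware scan this
 * layer out directly on one of its remaining overlay planes? */
static bool fits_hw_plane(const struct layer *l, size_t planes_left)
{
	return planes_left > 0 && !l->needs_rotation;
}

/* During validation, each layer is assigned either to a hardware
 * overlay plane (DEVICE) or back to the GPU (CLIENT).  At present
 * time, the CLIENT layers have already been composed by the GPU into
 * one target buffer; only that buffer plus the DEVICE layers are
 * handed to the display hardware. */
void validate_display(struct layer *layers, size_t n, size_t hw_planes)
{
	size_t planes_left = hw_planes;

	for (size_t i = 0; i < n; i++) {
		if (fits_hw_plane(&layers[i], planes_left)) {
			layers[i].type = DEVICE;
			planes_left--;
		} else {
			layers[i].type = CLIENT;
		}
	}
}
```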
To create an open-source stack, one clearly has to replace the proprietary vendor drivers. That means providing a driver for the GPU itself and an implementation of the HWC2 API. The latter can be found in the drm_hwc (or drm_hwcomposer) project, which was originally written at Google but which has since escaped into the wider community. It is sometimes used on Android systems now, Foss said, especially in embedded settings. The manufacturers of embedded devices are finding that their long-term support needs are well met with open-source drivers.
So a free Android stack is built around drm_hwc. It also includes components like Mesa and libdrm, and it's all based on the kernel's direct rendering manager (DRM) layer. Finally, there is a component called gbm_gralloc, which handles memory allocations and associates properties (which color format is in use, for example) with video buffers.
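To make that concrete, here is a minimal sketch of the kind of allocation such a gralloc implementation performs, using the real GBM API. The device-node path is an assumption, and real code would enumerate the nodes under /dev/dri and handle errors more carefully:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <gbm.h>

int main(void)
{
	/* Assumed device node; real systems enumerate /dev/dri/. */
	int fd = open("/dev/dri/card0", O_RDWR);
	if (fd < 0)
		return 1;

	struct gbm_device *gbm = gbm_create_device(fd);
	if (!gbm) {
		close(fd);
		return 1;
	}

	/* Ask for a buffer the GPU can render into and the display
	 * controller can scan out directly. */
	struct gbm_bo *bo = gbm_bo_create(gbm, 1920, 1080,
					  GBM_FORMAT_XRGB8888,
					  GBM_BO_USE_RENDERING |
					  GBM_BO_USE_SCANOUT);
	if (bo) {
		/* These are the per-buffer properties (format, stride)
		 * that a gralloc implementation records alongside the
		 * buffer handle. */
		printf("stride %u, format 0x%x\n",
		       gbm_bo_get_stride(bo), gbm_bo_get_format(bo));
		gbm_bo_destroy(bo);
	}

	gbm_device_destroy(gbm);
	close(fd);
	return 0;
}
```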
So what is the status of this work? There are a couple of important kernel components that were prerequisites to this support; one of those is buffer synchronization, which has recently been merged. This feature allows multiple drivers to collaborate around shared buffers; it was inspired by a similar feature in the Android kernel. Some GPU drivers now have support for synchronization. The other important piece was the atomic display API; it's the only API that supports synchronization. Most drivers have support for this API at this point, which is good, since HWC2 requires it.
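As a rough sketch of how these two pieces fit together, here is what an atomic page flip with explicit fences looks like through the real libdrm API. The property IDs passed in are assumed to have been looked up beforehand by name ("FB_ID", "IN_FENCE_FD", "OUT_FENCE_PTR") with drmModeObjectGetProperties(), and error handling is minimal:

```c
#include <stdint.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

int flip_with_fences(int drm_fd, uint32_t plane_id, uint32_t crtc_id,
		     uint32_t fb_prop, uint32_t in_fence_prop,
		     uint32_t out_fence_prop,
		     uint32_t fb_id, int in_fence_fd)
{
	int out_fence_fd = -1;
	drmModeAtomicReq *req = drmModeAtomicAlloc();

	if (!req)
		return -1;

	/* Scan out fb_id on this plane, but only once in_fence_fd
	 * signals, i.e. once the GPU has finished rendering into the
	 * buffer. */
	drmModeAtomicAddProperty(req, plane_id, fb_prop, fb_id);
	drmModeAtomicAddProperty(req, plane_id, in_fence_prop,
				 in_fence_fd);

	/* Ask the kernel for a fence that signals when the new frame
	 * is actually on screen, so the compositor knows when the old
	 * buffer can be reused. */
	drmModeAtomicAddProperty(req, crtc_id, out_fence_prop,
				 (uint64_t)(uintptr_t)&out_fence_fd);

	int ret = drmModeAtomicCommit(drm_fd, req,
				      DRM_MODE_ATOMIC_NONBLOCK, NULL);
	drmModeAtomicFree(req);
	return ret < 0 ? ret : out_fence_fd;
}
```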
There are a few systems where all of this works now. The i.MX6 processor with the Vivante GC3000 GPU has complete open-source support; versions with older GPUs are not yet supported at the same level. There is support for the DragonBoard 410c with the Adreno GPU. The MinnowBoard Turbot has an Intel HD GPU which has "excellent open-source software support". Finally, the HiKey 960 is a new high-end platform; it's not supported yet, but that support is "in the works".
Foss concluded by saying that support for Android on the mainline graphics stack is now a reality for a growing number of platforms. The platforms he named are development boards and such, though, so your editor took the opportunity to ask if there was any prospect for handsets with mainline graphics support in the future. Foss answered that there are "rumors" that Google likes this work and is keeping an eye on it. Time will tell whether those rumors turn into mainstream Android devices that can run current mainline kernels with blob-free graphics support.
[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting your
editor's travel to the Open Source Summit.]
| Index entries for this article | |
|---|---|
| Kernel | Android |
| Conference | Open Source Summit North America/2017 |
Posted Sep 13, 2017 3:09 UTC (Wed) by Tara_Li (guest, #26706)
I find this idea somewhat interesting - how much more oomph is needed to do the composing, and why don't GPUs have that bit built in? After all, if you're going to need a separate unit to do GPU work, then another to do composing of everything the GPU generates, you're going to need another bus with hella bandwidth to get A to B.

And are there going to be more stages that get offloaded from the GPU? I know for a time there were separate "physics engines" you could buy to offload some of *that* from the CPU/GPU - collision detection, flight of debris, etc...
Posted Sep 13, 2017 5:43 UTC (Wed) by linusw (subscriber, #40300)

On ST-Ericsson's ill-fated U8500 we had a hardware block called "B2R2", which reads "blit, blend, rotate and rescale" - which is what compositors need. I vaguely recall that the TI OMAP had something similar. (Maybe someone can fill in?)

Whether there is a mainline kernel-to-userspace abstraction for these engines is another question. I think at the time it was made into a custom character device and used directly from what is now HWC2.
Posted Sep 13, 2017 6:26 UTC (Wed) by zyga (subscriber, #81533)
The hardware I used to deal with ~15 years ago could handle one video and one bitmap layer. Later on we got more and more features: two video layers (one full-featured, with better de-interlacing and scaling, and one limited, for picture-in-picture), plus additional layers of arbitrary graphics for some nicer blending possibilities. All of this was on hardware that could not do any OpenGL.

Unfortunately none of that had sane drivers. At the time each vendor provided their own libraries to configure and use the video stack. Nowadays the problem is less visible because we have speedy CPUs and even integrated graphics has a lot to offer, but I suspect that, if such hardware were available and used correctly, we could save some power in idle-desktop and watching-video use cases.
Posted Sep 13, 2017 20:24 UTC (Wed) by rvfh (guest, #31018)
I am no expert so if you know better feel free to correct me!
Posted Sep 13, 2017 7:55 UTC (Wed) by daniels (subscriber, #16193)

OMAP DSS
Posted Sep 13, 2017 12:03 UTC (Wed) by excors (subscriber, #95769)
For display composition you don't want to write to memory at all - ideally you'd 'render' each pixel in raster order just as it's about to be sent out of the HDMI port (or equivalent), and then you save all the latency and power cost of writing to DRAM in the GPU then reading it back in the display controller.
Usually a phone isn't doing much 3D GPU stuff, it's just displaying a few static images (status bar, app UI) and perhaps a decoded video, and the "rendering" is just some colour conversion and scaling and alpha-blending, so it's easy to do in raster order.
(In practice you'd probably render a few lines at once and store them in on-chip memory until they're sent out to the display, to tolerate some jitter in the rendering speed, but that's only a few KBs of memory so it's fast and cheap. You still get timing problems if e.g. you try to alpha-blend too many planes at once and the compositor fills the line buffer more slowly than the display consumes it, in which case you probably have to fall back to expensive OpenGL composition to avoid display glitches, and you need clever drivers to decide exactly when and how to fall back.)
As far as I'm aware, all modern mobile SoCs (except maybe the absolute cheapest terrible ones) have special hardware to do that, though they all do it with significantly different feature sets and are completely unrelated at the kernel level; the only standardisation is that they all implement the Android HWC HAL.
> if you're going to need a separate unit to do GPU work, then another to do composing of everything the GPU generates, you're going to need another bus with hella bandwidth to get A to B.
I think A and B are the same place. Mobile SoCs don't have dedicated VRAM like in discrete desktop GPUs - OpenGL will render to a framebuffer in the shared system DRAM, alongside all the other static images and decoded videos etc, and the compositing hardware will read all those layers straight from DRAM as it needs them.
> And are there going to be more stages that get offloaded from the GPU
Plenty have already - some chips have used their GPU for video encoding, camera image processing, etc., and that work tends to move into dedicated hardware eventually (to save power and improve performance). Vendors who don't have dedicated hardware for some feature argue strongly that their GPU is great and efficient and there's no need for dedicated hardware; then a couple of years later their new chip moves that feature into dedicated hardware and they say how great it is now. CV algorithms and neural nets seem likely to be the next features to follow that pattern.
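[As a purely illustrative aside, the raster-order blending described at the top of this comment might look like the following sketch, with invented types, a stride equal to the layer width, and a simple premultiplied-alpha "over" operator:]

```c
#include <stdint.h>
#include <stddef.h>

struct layer {
	const uint32_t *pixels;	 /* premultiplied ARGB8888 in shared DRAM */
	int x, y, width, height; /* placement on the screen */
};

/* Blend one premultiplied-alpha source pixel over a destination pixel,
 * channel by channel: out = src + dst * (255 - src_alpha) / 255. */
static uint32_t blend_over(uint32_t dst, uint32_t src)
{
	uint32_t sa = src >> 24;
	uint32_t out = 0;

	for (int shift = 0; shift < 32; shift += 8) {
		uint32_t s = (src >> shift) & 0xff;
		uint32_t d = (dst >> shift) & 0xff;
		uint32_t c = s + d * (255 - sa) / 255;
		out |= (c > 255 ? 255 : c) << shift;
	}
	return out;
}

/* Compose one scanline: walk the layers bottom-to-top and blend the
 * pixels that intersect row y into a small on-chip line buffer, which
 * the display controller then ships straight to the panel and reuses
 * for the next row - no full-frame buffer is ever written to DRAM. */
static void compose_scanline(uint32_t *linebuf, int screen_width, int y,
			     const struct layer *layers, size_t n_layers)
{
	for (int x = 0; x < screen_width; x++)
		linebuf[x] = 0xff000000;	/* opaque black background */

	for (size_t i = 0; i < n_layers; i++) {
		const struct layer *l = &layers[i];

		if (y < l->y || y >= l->y + l->height)
			continue;		/* layer misses this row */

		const uint32_t *row =
			l->pixels + (size_t)(y - l->y) * l->width;
		for (int x = 0; x < l->width; x++) {
			int sx = l->x + x;
			if (sx >= 0 && sx < screen_width)
				linebuf[sx] = blend_over(linebuf[sx],
							 row[x]);
		}
	}
}
```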
Posted Sep 13, 2017 22:23 UTC (Wed) by excors (subscriber, #95769)
(Proper non-stripped-down VC4, like in Raspberry Pi, does compositing with HVS <https://dri.freedesktop.org/docs/drm/gpu/vc4.html>. The mainline vc4 driver uses that to implement DRM atomic mode setting.)
Posted Sep 13, 2017 13:03 UTC (Wed) by mjthayer (guest, #39183)
> I find this idea somewhat interesting - how much more oomph is needed to do the composing, and why don't GPUs have that bit built in? After all, if you're going to need a separate unit to do GPU work, then another to do composing of everything the GPU generates, you're going to need another bus with hella bandwidth to get A to B.

I thought that embedded GPUs tended to use main RAM directly rather than dedicated video memory. If that is what you meant.
Fame

Posted Sep 13, 2017 4:07 UTC (Wed) by bojan (subscriber, #14302)
Surely you meant infamously there, Jon. :-)
Posted Sep 13, 2017 22:01 UTC (Wed) by bero (guest, #89787)

Given that the Nexus 5, 6, 5X, and 6P, as well as the Pixels, use Adreno GPUs, it should be doable for those devices too (and yes, experiments are underway).
Posted Sep 14, 2017 17:57 UTC (Thu) by rahvin (guest, #16953)
Hell, they'd probably replace the kernel with something in-house if they could. Oh wait, they are already moving in that direction.
Posted Sep 15, 2017 16:56 UTC (Fri) by markjanes (guest, #58426)

https://01.org/android-IA