KS2012: ARM: AArch64
Catalin Marinas led a discussion of kernel support for 64-bit ARM processors as part of day two of the ARM minisummit. He concentrated on the status of the in-flight patches to add that support, while pointing to his LinuxCon talk later in the week for more details about the architecture itself.
A second round of the ARM-64 patches was posted to the linux-kernel mailing list in mid-August. After some complaints about the "aarch64" name for the architecture, it was changed to "arm64", at least for the kernel source directory. That name will really only be seen by kernel developers as uname will still report "aarch64", in keeping with the ELF triplet used by the binaries built with GCC.
Some of the lessons learned from the ARM 32-bit support have been reflected in arm64. It will target a single kernel image by default, for example. That means that device tree support is mandatory for AArch64 platforms. Since there are not, as yet, any AArch64 platforms, the patches contain simplified platform code based on that of the Versatile Express.
There are two targets for AArch64 devices: embedded and server. It is possible that ACPI support will be required for the servers. As far as Marinas knows, there is no ACPI implementation for ARM out there, but it is not clear what Microsoft is doing in that area.
The code for generic timers and the generic interrupt controller (GIC) lives under the drivers directory. That code could be shared with arch/arm, but there is a need to #ifdef the inline assembly code.
There is an intent to push back on the system-on-a-chip (SoC) vendors regarding things like firmware initialization, boot protocol, and a standardized secure mode API. SoC vendors (and thus, their ARM sub-trees) should be providing the standard interfaces, rather than heading out on their own. The ARM maintainers can choose not to accept ports that do not conform.
That may work for devices targeted at Linux, but there may be SoC vendors who initially target another operating system, as Olof Johansson noted. There will likely need to be some give and take for things such as the boot protocol when devices targeting Windows, iOS, or OS X are submitted. Marinas said that the aim would be for standardization, but they "may have to cope" with other choices at times.
The first code from SoC vendors is not expected before the end of the year, Marinas said. Arnd Bergmann half-jokingly suggested that he would be happy to get a leaked version of that code at any time. The first SoCs might well just be existing 32-bit ARMv7 SoCs with an AArch64 CPU (aka ARMv8) dropped in. That may be the path for embedded applications, though the vendors targeting the server market are likely to be starting from scratch.
That led to a discussion of how to push the arm64 patches forward. Marinas would like to push the core architecture code forward, while working to clean up the example SoC code. He would like to target the 3.8 kernel for the core. Bergmann was strongly in favor of getting it all into linux-next soon, and targeting a merge for the 3.7 development cycle.
Marinas is concerned that including the SoC code will delay inclusion as it will require more review. He also wants to make sure that there is a clean base for those who want to use it as a basis for their own SoC code. That cleanup should take two weeks or so, Marinas said. He hopes to get it into linux-next sometime after 3.7-rc1, but Bergmann encouraged a faster approach. There is nothing very risky about doing so, Johansson pointed out, as a new architecture cannot break any existing code.
There is some concern about the 2MB limit on device tree binary (dtb) files because some network controllers (and other devices) may have firmware blobs larger than that. Bergmann noted that it may not be possible to ship those blobs in the kernel, but they could be put into firmware and loaded from there. It turns out that the flattened device tree format already has a length entry in its header that can be used to support multiple dtbs, which will allow the 2MB limit to be worked around.
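To illustrate how a length field in the header makes multiple dtbs workable, here is a minimal sketch in Python. It relies on two facts about the flattened device tree format: the header is ten big-endian 32-bit words, and the first two are the magic number (0xd00dfeed) and the total size of the blob. Everything else is an assumption made for illustration, not something taken from the arm64 patches: the idea that the dtbs are simply packed back to back with no padding, and the names split_dtbs() and fake_dtb(), which are hypothetical helpers.

```python
import struct

FDT_MAGIC = 0xD00DFEED
# The flattened device tree header is ten big-endian 32-bit words;
# the first two are the magic number and the total size of the blob.
FDT_HEADER = struct.Struct(">10I")


def split_dtbs(blob):
    """Split a buffer of back-to-back dtbs into individual images.

    Assumption (for illustration only): the images are concatenated
    with no padding between them.
    """
    images = []
    offset = 0
    while offset + FDT_HEADER.size <= len(blob):
        magic, totalsize = FDT_HEADER.unpack_from(blob, offset)[:2]
        if magic != FDT_MAGIC:
            break  # no further device tree starts here
        if totalsize < FDT_HEADER.size or offset + totalsize > len(blob):
            raise ValueError("bad totalsize at offset %d" % offset)
        images.append(blob[offset:offset + totalsize])
        offset += totalsize  # the length entry tells us where the next one begins
    return images


def fake_dtb(payload_len):
    """Build a dummy blob with a valid magic and totalsize (hypothetical helper).

    The remaining header fields are zeroed, so this is not a usable
    device tree, only something with the right framing.
    """
    total = FDT_HEADER.size + payload_len
    header = FDT_HEADER.pack(FDT_MAGIC, total, 0, 0, 0, 0, 0, 0, 0, 0)
    return header + bytes(payload_len)


if __name__ == "__main__":
    combined = fake_dtb(24) + fake_dtb(100)
    for index, image in enumerate(split_dtbs(combined)):
        print("dtb %d: %d bytes" % (index, len(image)))
```

A loader working along these lines would only need to read the size from each header in turn to find the next tree, so no single dtb has to hold everything.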
The existing arm64 emulation does not have any DMA, so support for that feature is currently untested. In addition, some SoCs are likely to only support 32-bit DMA. Bergmann suggested an architecture-independent implementation that used dma_ops pointers to provide both coherent and non-coherent versions, but Marinas would like to do something simpler (i.e. coherent only) to start with. Since the "hardware" currently lacks DMA, "all DMA is coherent" seems like a reasonable model, Bergmann said. Since no one will be affected by any bugs in the code, he suggested getting it into linux-next as soon as possible.
Tony Lindgren asked if ARM maintainer Russell King had any comments on the patches. Marinas said that there were not many, at least so far. Bergmann said that he didn't think King was convinced that having a separate arm64 directory (as opposed to adding 64-bit support to the existing arm directory) was the right approach.
Many of the decisions were made for ARM 15 years ago, Marinas said, and some of those make it messy to drop arm64 on top of arm. Some day, when the arm tree only supports ARMv7, it may make sense to merge with arm64. The assembly code cannot be shared, because they are two different architectures, Bergmann said. In addition, the system calls cannot be shared and the platform code is going to be done very differently for arm64, he said.
But there is room for sharing some things between the two trees, Marinas said. That includes some of the device tree files, perf, the generic timer, the GIC driver code, as well as KVM and Xen if and when they are merged. In theory, the ptrace() and signal-handling code could be shared as well.
Progress is clearly being made for arm64, and we will have to wait and see how quickly it can make its way into the mainline.
