Building the kernel with clang
By Jake Edge
September 19, 2017
Linux Plumbers Conference
Over the years, there has been a persistent effort to build the Linux kernel using the Clang C compiler that is part of the LLVM project. We last looked in on the effort in a report from the LLVM microconference at the 2015 Linux Plumbers Conference (LPC), but we have followed it before that as well. At this year's LPC, two Google kernel engineers, Greg Hackmann and Nick Desaulniers, came to the Android microconference to update the status; at this point, it is possible to build two long-term support kernels (4.4 and 4.9) with Clang.
Desaulniers began the presentation by answering the most commonly asked question: why build the kernel with Clang? To start with, the Android user space is all built with Clang these days, so Google would like to reduce the number of toolchains it needs to support. He acknowledged that it is really only a benefit to Google and is "not super useful" elsewhere. But there are other reasons that are beneficial to the wider community.
There are some common bugs that often pop up in kernel code, especially out-of-tree code like the third-party drivers that end up in Android devices. The developers are interested in using the static analysis available in Clang to spot those bugs, but the kernel needs to be built using Clang to do so. There are also a number of dynamic-analysis tools that can be used, like the various sanitizers (e.g. AddressSanitizer or ASan) and their kernel equivalents (e.g. KernelAddressSanitizer or KASAN).
Clang provides a different set of warnings than GCC does; looking at those will result in higher quality code. It is clearly beneficial to all kernel users to have fewer bugs in it. There are some additional tools that are planned using Clang. One is a control-flow-analysis tool that could enumerate valid stack frames at compile time; those could be checked at run time to eliminate return-oriented programming (ROP) attacks. There is also work going on for link-time optimization (LTO) and profile-guided optimization (PGO) for Clang, which could provide better execution speed, especially for hot paths.
Building code with another compiler is a good way to shake out code that relies on undefined behaviors. Since the language specification does not define certain behaviors, compiler developers can choose whatever is convenient. That choice could change, so even a GCC upgrade might cause misbehavior if some kernel code is relying on undefined behavior. The hope, Desaulniers said, is that both the kernel and LLVM/Clang can improve their code bases from this effort. The kernel is a big project with a lot of code that can find bugs in the compiler; in fact, it already has.
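As a rough illustration of the kind of code that a second compiler can shake out, consider an overflow check. This is a minimal sketch and is not taken from the kernel or from the talk; the function names are hypothetical. The first function relies on signed overflow, which the C standard leaves undefined, so a compiler is allowed to assume the addition never wraps and may drop the test entirely. The second expresses the same intent using only defined operations.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical example: relies on undefined behavior. Signed overflow is
 * not defined by the C standard, so a compiler may assume "a + b" never
 * wraps and is free to optimize this check away.
 */
bool add_would_overflow_ub(int a, int b)
{
    return b > 0 && a + b < a;
}

/*
 * Well-defined alternative: compare against the limits before adding,
 * so no overflowing arithmetic is ever performed.
 */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

Whether the first version happens to work depends on the compiler, its version, and the optimization flags in use, which is exactly why it may behave differently after a toolchain change.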
Greg Kroah-Hartman said that "competition is good"; he was strongly in favor of the effort. Desaulniers was glad to hear that, as he and others were worried that the tight coupling with GCC was being protected by the kernel developers. Kroah-Hartman said that there have been other compilers building the kernel along the way. Behan Webster also pointed to all of the new features that have come about in GCC over the past five years as a result of the competition with LLVM. Kroah-Hartman said that he wished there was a competitor to the Linux kernel.
Hackmann related the state of the upstream kernel: "we are very close to having a kernel that can be built with Clang". It does require using a recent Clang that has some fixes, but the x86_64 and ARM64 kernels can be built, though each architecture has one out-of-tree patch that needs to be applied to do so. There is also one Android-specific Kbuild change that is needed, but only if the Android open-source project (AOSP) pre-built toolchain is being used.
As announced on the kernel mailing list, there are patches available for the 4.4 and 4.9 kernels. There are also experimental branches of the Android kernels for 4.4 and 4.9 available from AOSP. More details can be found in the slides [PDF]. Those branches had just been pushed a few days earlier, Hackmann said, and the HiKey boards were able to build and boot that code shortly thereafter.
There have been LLVM bugs found in the process, though most of them have been fixed at this point, Desaulniers said. The initial work was done with LLVM 4.0, but they have since updated to 5.0 and are also building with the current LLVM development tree (which will become 6.0). You can probably build the kernel with 4.0, he said, but it will be much slower than building with 5.0 or later.
There are still some outstanding issues. Variable-length arrays as non-terminal fields in structures are not supported by Clang, there is a GNU C extension for inline functions that is not supported, and the LLVM assembler cannot be used to build the kernel. Hackmann noted that the GNU assembler is too liberal in what it accepts.
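To make the first of those issues concrete, here is a small sketch of what a variable-length array in a non-terminal structure field looks like, along with one way such code could be restructured. It is an illustration only, not the fix used in any particular kernel patch; the helper names (flags_offset(), record_size(), store_flags(), load_flags()) and the record layout are assumptions made up for this example. The GCC-only form appears in a comment because Clang will not compile it; the replacement computes the field offset by hand within a flat buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * GCC-only form (not accepted by Clang), shown for comparison. It is
 * only valid inside a function where "n" is a run-time value:
 *
 *     struct {
 *         char name[n];    variable-length, but not the last field
 *         int  flags;
 *     } rec;
 */

/* Portable way to obtain the alignment of int without C11 _Alignof. */
struct align_probe { char c; int i; };
#define INT_ALIGN offsetof(struct align_probe, i)

/* Hypothetical helper: offset of "flags" after an n-byte name. */
size_t flags_offset(size_t n)
{
    return (n + INT_ALIGN - 1) / INT_ALIGN * INT_ALIGN;
}

/* Total bytes needed for the n-byte name plus the trailing int. */
size_t record_size(size_t n)
{
    return flags_offset(n) + sizeof(int);
}

/* Store "flags" into a flat buffer of at least record_size(n) bytes. */
void store_flags(unsigned char *buf, size_t n, int flags)
{
    memcpy(buf + flags_offset(n), &flags, sizeof(flags));
}

/* Read "flags" back from the same flat buffer. */
int load_flags(const unsigned char *buf, size_t n)
{
    int flags;

    memcpy(&flags, buf + flags_offset(n), sizeof(flags));
    return flags;
}
```

A caller would allocate record_size(n) bytes, place the name in the first n bytes, and use store_flags() and load_flags() for the trailing field. Using memcpy() rather than casting the buffer pointer avoids any alignment or aliasing assumptions about the buffer itself.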
This work has shown that the FUD surrounding using a new toolchain for the kernel is unfounded, Desaulniers said. It is working now, but there are a few asterisks. Clang, the front end, can compile the kernel, but the assembler and the linker from GNU Binutils are needed to complete the build process.
Next up is figuring out how to do automated testing of LLVM and the kernel. Currently, the team is working with two specific LTS kernel branches and using specific LLVM versions. So he can't quite say that Clang will build any kernel, since there are so many different configuration options. A bot to check whether kernel patches will fail to build under Clang is in the works as well. An audience member noted that kernelci.org is looking at adding other compilers to its build-and-boot testing.
Hackmann and Desaulniers encouraged others to try building using Clang. All it takes is a simple "make CC=clang" on a properly equipped system. We are, it seems, quite close to having a two-compiler world for the Linux kernel.
[I would like to thank LWN's travel sponsor, The Linux Foundation, for assistance in traveling to Los Angeles for LPC.]