How many ways are there to configure the Linux kernel?
There are a large number of ways to configure the 6.16 Linux kernel. It has 32,468 different configuration options on x86_64, and a comparable number for other platforms. Exploring the ways the kernel can be configured is sufficiently difficult that it requires specialized tools. These show the number of possible configurations that options can be combined in has 6,550 digits. How has that number changed over the history of the kernel, and what does it mean for testing?
Analyzing Kconfigs
With so many configuration options to consider, it can be difficult to even figure out which options are redundant, unused, or otherwise amenable to simplification. For our problem of counting the number of ways to configure the kernel, we need to consider the relationships between different configuration options as well. Happily, there is some software to manage the growing complexity.
The kernel uses a custom configuration language, Kconfig, to describe how the build can be customized. While Kconfig has spread to a handful of related projects, it remains a mostly home-grown, Linux-specific system. Paul Gazzillo, along with a handful of other contributors, wrote kmax, a GPL-2.0-licensed software suite for analyzing Kconfig declarations. That project includes tools to determine which configuration options need to be enabled in order to test a patch, to find configuration bugs, and to export Kconfig declarations in a language that can be understood by more tools. It has been integrated into the Intel 0-day test suite, finding a number of bugs in the kernel's configuration in the process.
One of the project's programs, klocalizer, can output the set of relationships between different options in SMT-LIB 2 or DIMACS format — the standard input formats for modern SAT solvers. This makes it much simpler to write custom ad-hoc analysis of the kernel's configuration options. For example, one could use Z3's Python bindings to load the output from klocalizer and explore how options relate to one another. A SAT solver such as Z3 produces (when possible) a single solution to a set of constraints — in this case, a single valid way to configure the kernel.
Calculating the number of valid solutions is a related but much harder problem called #SAT. While there are a number of approximation techniques that can be used to solve #SAT probabilistically, in this case the constraints in the kernel's configuration are simple enough for a precise counter called ganak to solve the problem exactly.
The numbers
So, with these tools in hand, one can analyze how many ways to configure the kernel there really are. The growth in the number of configuration parameters in the kernel has been roughly linear, leading to exponential growth in the number of ways the kernel can be configured. The following chart shows two trend lines: the logarithm of the actual number of ways to configure the kernel in blue, and the number of configuration options (scaled appropriately to sit on the same axis; see below) in red. The red line represents the theoretical maximum that the blue line could obtain. Plotting 6,000 digit numbers directly proved to be too much for both LibreOffice Calc and Google Sheets, hence the logarithms.
There are a few things that can be immediately seen from this chart. For example, there are more than twice as many configuration options available in version 6.16 compared to version 2.6.37 (the earliest version that kmax can understand). In some sense, this is not surprising — the Linux kernel keeps adding support for new features and new hardware; the number of configuration options is just growing along with the code. But there is a subtler trend that is arguably more important: configuration options are becoming more interdependent over time.
If every configuration option could be enabled independently of every other configuration option, there would be 2N ways to configure the kernel given N configuration options. The red line is the number of configuration options scaled by ln(2)⁄ln(10) — that is, it represents the logarithm of 2N. That way the two lines are directly comparable. The actual number of ways to configure the kernel sits below this theoretical maximum because configuration options can depend on or exclude each other. This chart shows the difference between the red and blue lines:
Note that this result is still on a logarithmic scale, so this value is the logarithm of the number of ways to configure the kernel that are ruled out by dependencies between different options. As the difference grows, it shows that more configuration options are interacting with each other. This is not, in itself, a problem — but those interactions are likely associated with interdependent pieces of code. The number of places where kernel code depends on multiple related configuration options is growing.
With the number of possible configurations, it is certain that not all of them have been tested individually. If the configuration options were independent, testing a kernel with everything enabled and one with everything disabled would theoretically suffice. Many options, especially ones that are specific to a particular hardware driver, are genuinely independent. But when options can influence each other and interact (even outside of Kconfig interactions), the number of scenarios that need to be tested grows quickly.
Linux's testing is largely distributed — it is done by individual patch contributors and reviewers, several different automatic unit testing and fuzzing robots, distribution QA teams, companies, and normal Linux users. With the decentralized nature of open source, and the large number of possible configurations, it's hard to tell exactly how many configurations are being covered. The kernels used by popular distributions are well-tested, but enabling other configuration options is a fast route into uncharted territory. The Linux kernel isn't going to stop getting new features — so the only real solution to the problem is to try to simplify existing configuration options when possible. The recent work to remove special uniprocessor support from the scheduler is an example of that.
Ultimately, among the many challenges of kernel development, this is not the most urgent or important. It is worth keeping an eye on, though, for the sake of the sanity of anyone trying to configure a kernel, if nothing else.
Posted Sep 10, 2025 17:14 UTC (Wed)
by nickodell (subscriber, #125165)
[Link] (7 responses)
Posted Sep 10, 2025 17:19 UTC (Wed)
by daroc (editor, #160859)
[Link] (5 responses)
Posted Sep 10, 2025 19:46 UTC (Wed)
by vegard (subscriber, #52330)
[Link] (3 responses)
There are also a bunch of configs that will rely on the host configuration, e.g. choosing lz4 compression for vmlinux when no lz4 utilities are installed on the host, and which will fail to build successfully for those reasons. Many more combinations of options will fail to boot for various reasons that are not necessarily bugs. There are things you can do to minimize this kind of thing, either you can apply some patches so that those combinations never appear or you can run scripts/config after randconfig to enable/disable specific options as needed.
Posted Sep 11, 2025 12:44 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
But you need the existing mode too as an option that requires N things will then have 1/2^N chance of being allowed on. Though…perhaps one way would be to:
- make a bag of all options
Maybe just having the bag have pairs of (option, value) each individually weighted would be better?
Posted Sep 11, 2025 12:59 UTC (Thu)
by daroc (editor, #160859)
[Link] (1 responses)
There's actually a tool by the same people who made ganak that will randomly sample from a constrained space of options with negligible bias. So you might be able to use kmax to convert the Kconfigs to a SAT problem, use that tool (csb) to sample a random model, and then convert that back to a .config.
Posted Sep 11, 2025 22:43 UTC (Thu)
by quotemstr (subscriber, #45331)
[Link]
It solves the problem of generating useful "pairwise testing" configurations by picking from large numbers of options with constraints. Surprisingly non-trivial yet very useful.
Posted Sep 11, 2025 17:59 UTC (Thu)
by make (subscriber, #62794)
[Link]
Posted Sep 18, 2025 17:56 UTC (Thu)
by yanjun.zhu (guest, #173290)
[Link]
Posted Sep 10, 2025 17:36 UTC (Wed)
by willy (subscriber, #9762)
[Link] (2 responses)
To say nothing of various syscalls you can make.
The search space for "is the kernel correct" (for literally any definition of the word "correct") is mind-numbingly big. And that shouldn't surprise us; BB-6 is unknown, but larger than 10↑↑15 (for those not familiar, that's 10^ (10^ (10 ^ ...)) to a height of 15). The kernel is many orders of magnitude larger than BB-6.
https://en.wikipedia.org/wiki/Busy_beaver
Posted Sep 10, 2025 19:11 UTC (Wed)
by tux3 (subscriber, #101245)
[Link]
Posted Sep 11, 2025 8:06 UTC (Thu)
by DemiMarie (subscriber, #164188)
[Link]
Posted Sep 10, 2025 22:38 UTC (Wed)
by jrtc27 (subscriber, #107748)
[Link]
I am disappointed to learn that Daroc is a quitter when it comes to making their point properly: https://xkcd.com/1162/ :)
Posted Sep 11, 2025 6:03 UTC (Thu)
by adobriyan (subscriber, #30858)
[Link]
Posted Sep 11, 2025 7:23 UTC (Thu)
by wsy (subscriber, #121706)
[Link] (2 responses)
Posted Sep 11, 2025 7:32 UTC (Thu)
by Sesse (subscriber, #53779)
[Link] (1 responses)
Posted Sep 11, 2025 9:59 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link]
Posted Sep 15, 2025 23:12 UTC (Mon)
by calvinowens (subscriber, #100757)
[Link]
It's an interesting consequence of the seeded approach, but maybe it doesn't matter, I suppose 2^32 is enough that people testing randconfigs in the wild aren't generating the same randconfigs.
What percentage of valid configuration options successfully build?
It has 32,468 different configuration options on x86_64
Darn it, so close to a pleasing round number. Could someone add 300 more configuration options real quick?
With the number of possible configurations, it is certain that not all of them have been tested individually. If the configuration options were independent, testing a kernel with everything enabled and one with everything disabled would theoretically suffice.
I wonder what percentage of these configurations successfully compile. Of course, it wouldn't be possible to try all configurations, but it seems like it would be possible to randomly sample 100 legal configurations from the set of possible configurations, and test if they successfully compile.
My intuition for this is that maybe 20% of randomly sampled configurations would fail to compile, but I'm curious what others think.
What percentage of valid configuration options successfully build?
What percentage of valid configuration options successfully build?
What percentage of valid configuration options successfully build?
- weight each option based on the number of other options that either require it or it requires
- pull an option from the bag at random
- choose a value for it
- remove options from the bag that are forced to be a value based on the selected value
- repeat until bag is empty
What percentage of valid configuration options successfully build?
What percentage of valid configuration options successfully build?
What percentage of valid configuration options successfully build?
What percentage of valid configuration options successfully build?
And then there's runtime options ...
And then there's runtime options ...
Formal verification techniques
Log Scale
fake options
Why so many 0s at the end?
Why so many 0s at the end?
Why so many 0s at the end?
32-bit KCONFIG_SEED
