The Ninja build tool
Ninja is a build tool, similar in spirit to make, that was born from the Chrome browser project in 2011. It is unique among build systems in that it was explicitly designed to be used as an "assembly language" by higher-level build tools. Beyond that, Ninja is fast. It is used by several large open-source projects and it has changed the way I approach build systems.
Ninja lacks many of the features that users of make are familiar with: no conditionals, no string-manipulation functions, and no patterns or wildcards. Instead, you are supposed to put this logic into a separate program that generates Ninja build files; you aren't supposed to write Ninja build files by hand, though they are still quite readable. Some people write this generator program from scratch in a high-level language like Python, and they usually call it configure. Others use a build system such as CMake, GYP, or Meson, all of which can generate Ninja files based on rules written in their own custom languages.
Despite its minimalist philosophy, Ninja does have a few features that make lacks: a progress report on its standard output, support for filenames with spaces, a "pools" mechanism to limit the parallelism of certain rules, and the ability to "include" other Ninja files without polluting the namespace of the including file. Ninja also has built-in support for some features that you could implement in a Makefile with various techniques, but that are tedious to do by hand. I'll cover some of these below.
Rules and targets
A Ninja file to compile and link a C program from two source files might look like this; the targets (that is, the files that are created by the build process) are source1.o, source2.o, and myprogram:
    rule cc
      command = gcc -c -o $out $in
      description = CC $out

    rule link
      command = gcc -o $out $in
      description = LINK $out

    build source1.o: cc source1.c
    build source2.o: cc source2.c
    build myprogram: link source1.o source2.o
The last three lines are build statements. For example, myprogram is a target, source1.o and source2.o are its inputs or dependencies, and link is the rule (defined a few lines earlier) that will build the target from the inputs. In the rule's command line, $out is replaced by the target and $in is replaced by the list of inputs, with the appropriate shell quoting. Ninja also supports implicit dependencies, implicit outputs, and order-only dependencies. Providing a description for a rule turns on automake-style silent rule output.
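To illustrate the last point with a hypothetical build statement (the file names are invented): implicit dependencies are listed after a `|`, and order-only dependencies after a `||`:

```ninja
# foo.h triggers a rebuild when it changes, but is not part of $in;
# generated/headers must be built before this target, but changing it
# does not by itself cause a rebuild.
build source1.o: cc source1.c | foo.h || generated/headers
```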
Auto-generated dependencies
In Makefiles, the standard technique for detecting changes to implicit dependencies (such as C header files) is to generate a small Makefile snippet that specifies the dependencies and to include this generated file into your main Makefile.
The equivalent technique in a Ninja file looks like this:
    rule cc
      command = gcc -c -o $out $in -MMD -MF $out.d
      depfile = $out.d
      deps = gcc
Here, -MMD tells GCC to output the list of included files and -MF says where to write it. Normal compilation happens too; the dependency file is generated as a side-effect of the compilation process. The depfile statement tells Ninja where to read those additional dependencies from, and deps = gcc tells Ninja to parse the file right after the build and store the result in its own internal database. The first time you run Ninja, it will build source1.o because it is out of date with respect to its explicit dependency, source1.c. On subsequent runs, Ninja will check whether source1.c has changed or whether any of the header files listed in the depfile have changed. To support this technique, Ninja understands just the subset of Makefile syntax that is actually generated by C pre-processors. Ninja also supports a similar feature of the Microsoft Visual C++ compiler, which prints specially-formatted lines to stderr when given the /showIncludes flag.
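For the rule above, the source1.o.d file that GCC writes is a plain Makefile fragment along these lines (the header names are invented):

```make
source1.o: source1.c util.h config.h
```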
Keeping track of build command lines
Ninja remembers the command line that was used to build each target. If the command line changes (probably because the build.ninja file itself changed), then Ninja will rebuild the target, even if the dependencies didn't change. Ninja tracks this information across invocations by storing a hash of the build command for each target in a .ninja_log file in the top-level build directory.
You can do this with Makefiles too — the technique is described here and it's used by the Makefiles for both Git and Linux. But it's tedious, error-prone, and slow.
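The make version of the technique looks roughly like this (a sketch with invented names, not the exact code used by Git or Linux; recipe lines must be indented with tabs):

```make
CC_CMD = gcc $(CFLAGS) -c

# The stamp rule runs on every invocation (FORCE), but rewrites the
# stamp file only when the command line differs from what's stored,
# so targets depending on it rebuild only when the command changes.
cc_cmd.stamp: FORCE
	@echo '$(CC_CMD)' | cmp -s - $@ || echo '$(CC_CMD)' > $@

foo.o: foo.c cc_cmd.stamp
	$(CC_CMD) -o $@ foo.c

FORCE:
```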
Generating (and re-generating) the build file
Ninja doesn't support loops, wildcards, or patterns; you're supposed to generate the Ninja build file from another program. A simple configure script written in Python might look like this:
    f = open("build.ninja", "w")
    sources = ["source1.c", "source2.c"]
    for source in sources:
        f.write("build {outputs}: {rule} {inputs}\n".format(
            outputs=source.replace(".c", ".o"),
            rule="cc",
            inputs=source))
    # etc
If you want to support environment variables like $CFLAGS, it is best practice to read these variables in the configure script, and bake the values into the Ninja file. This makes it easier to maintain multiple build folders, such as a debug and a production build. The autotools behave this way.
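A sketch of what that might look like in the configure script (the generate helper and the default flags are my own invention, not from the article):

```python
import os

def generate(cflags):
    # Bake the flag values into the generated file as a Ninja variable,
    # so later "ninja" runs don't depend on the caller's environment.
    return ("cflags = {}\n"
            "rule cc\n"
            "  command = gcc $cflags -c -o $out $in\n").format(cflags)

# Read the environment once, at configure time; each build directory
# gets its own build.ninja with its own baked-in flags.
with open("build.ninja", "w") as f:
    f.write(generate(os.environ.get("CFLAGS", "-O2")))
```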
Now if you edit your configure script to add another source file, source3.c, you'll want to ensure that build.ninja is re-generated. You can achieve this with another Ninja rule:
    rule configure
      command = ./configure
      generator = 1

    build build.ninja: configure
Thus, if build.ninja is out of date (older than configure), Ninja will run configure to re-create build.ninja before it does anything else. The generator statement is necessary to exclude the target (the build.ninja file) from being removed by Ninja's built-in clean tool (ninja -t clean). In practice, you would also want to remember any parameters originally given to configure (such as $CFLAGS) and bake them into the rule that re-runs configure.
If you're using a generator program like CMake, the principle is the same. The build.ninja file generated by CMake will arrange for itself to be re-generated if you edit CMake's build description file CMakeLists.txt.
Performance
![No-op build performance](https://static.lwn.net/images/2016/ninja-no-op-build.png)
Ninja's original motivation was speed. A no-op build of Chrome (where all the targets are already up to date) reportedly took 10 seconds with make, but less than a second with Ninja.
According to my own benchmarks, Ninja's speed advantage is only really significant for very large projects (on Linux, at least). However, I didn't try to implement my own version of Ninja's "rebuild if the command line changes" in make, which would presumably slow make down further.
Ninja generators and users
CMake is the most widely-used build system with Ninja support. CMake has always been a "meta build system" in that it generates build files for other build systems: several varieties of Makefiles, Xcode project files for the Mac, or Visual Studio project files for Windows. Since v2.8.8 in 2012, it can generate Ninja files as well.
GYP ("Generate Your Projects") is the build system used by Chromium and related projects such as the V8 JavaScript engine. As far as I know, it doesn't have much adoption elsewhere, and I don't know much about it.
Meson is a fairly recent build system that seems to be gaining traction. Unlike CMake, which seems to have come from the Windows world, Meson wants to provide first-class support for the Linux open-source ecosystem (pkg-config, GNOME/GLib, and so on), and the maintainers are happy to merge patches to support these kinds of projects. Maybe one day it will be able to replace the autotools, for many projects at least. The GStreamer project recently merged "experimental" support for building with Meson; see this talk [video] from last month's GStreamer Conference.
A few other Ninja generators are listed on the Ninja wiki, but it's hard to tell which of those are toy projects and which are suitable for large or complex projects.
Large projects that use Ninja include Chromium, of course; LLVM since 2012, via CMake; the Android Open Source Project, since late 2015, by parsing and translating GNU Makefiles; and GStreamer's experimental Meson-based build system mentioned above. Ninja is available in major Linux distributions — after all, it's needed to build Chromium. It's usually packaged as "ninja-build".
The Ninja community
Ninja was originally written by Evan Martin, who was working on the Chrome browser, in 2011. Martin handed over maintainership in April 2014 to Nico Weber (also on the Chrome team) because "I actually haven't myself used Ninja in something like two years", he said, having left the Chrome team. Even so, Martin is still active on the mailing list and in the Git logs.
In the last two years, the release cadence has slowed down to one release every six to ten months. The feature set is pretty stable; these days the releases contain mostly bug fixes, though some useful new features do occasionally make their way in.
The latest major release (1.7.1 in April 2016) had 160 commits by 28 different contributors. 57% of the commits were from Weber and Martin, 14% from other Google employees, 5% from Kitware (the company behind CMake), 7% from a handful of other companies (SAP, Bloomberg, SciTools), and the remaining 17% of commits from 16 contributors whose company affiliation isn't obvious from the Git logs. The mailing list is fairly quiet, with a handful of threads per month, but it does include a good amount of feature discussion, not just support queries.
Some fairly obvious bugs are still open after four years: for example, Ninja doesn't support sub-second timestamp resolution. Other convenient features never get implemented (such as deleting output files when a build fails, like GNU Make's .DELETE_ON_ERROR), partly because it's easy to implement workarounds in your Ninja-file generator. Keeping the Ninja codebase small and focused seems to be the driving philosophy. All in all, the project seems healthy and mature. Ninja is written in C++ (12,000 lines of it, of which 40% are tests). It is released under the Apache 2.0 license.
Ninja's big idea
For me, Ninja's biggest contribution is to popularize the concept of generating build files from a real programming language. Many projects will find CMake or Meson to be a good fit, but when the needs are more complex, it can be surprisingly simple and elegant to use a real programming language like Python atop a dumb build system like Ninja or even a subset of make.
At $DAY_JOB, we build what is essentially a custom Linux distribution for appliances, with services packaged in Docker images. The build system was getting hard to debug, and we were losing confidence in the correctness of incremental builds. We decided to try Ninja. Step one was to get rid of patterns and conditionals in the Makefiles and write a Python script to generate the Makefiles. Step two was to output Ninja format instead of make. Before even getting to step two, however, we had already gained significant improvements in understandability, traceability, and debuggability.
Of course generating Makefiles is not a new idea — the autotools and CMake have been doing it for decades. But Ninja has taught me just how easy and flexible this approach is. For more information, Ninja's manual is a short and pleasant read. The free book "The Performance of Open Source Applications" has a chapter on Ninja that covers the original motivations for Ninja and some implementation details.
Index entries for this article: GuestArticles: Rothlisberger, David
Posted Nov 17, 2016 5:21 UTC (Thu) by roc (subscriber, #30627)
Posted Nov 17, 2016 20:09 UTC (Thu) by pbonzini (subscriber, #60935)
I need to write a blog post about it someday...
Posted Nov 18, 2016 1:41 UTC (Fri) by mathstuf (subscriber, #69389)
Posted Nov 18, 2016 18:07 UTC (Fri) by madscientist (subscriber, #16861)
In the sense that there are many different instances of make invoked, it's absolutely recursive (see my pstree output below) with all the overhead involved with that. It's likely that some of the worst problems with recursive makefiles that are described in Miller's paper on the subject are not present in this build system: I haven't investigated but I assume that the "master" makefile has some sense of which high-level targets depend on others.
I would say there's little doubt that the makefiles generated by CMake are not getting every last erg of performance out of make (and especially not GNU make, since they're standard POSIX makefiles).
Here's a pstree of a CMake-generated build from a medium-sized project (but one with lots of different targets):
Posted Nov 20, 2016 14:14 UTC (Sun) by mathstuf (subscriber, #69389)
Posted Nov 18, 2016 5:29 UTC (Fri) by wahern (guest, #37304)
Posted Nov 18, 2016 11:20 UTC (Fri) by mathstuf (subscriber, #69389)
Posted Nov 18, 2016 22:58 UTC (Fri) by wahern (guest, #37304)
Posted Nov 18, 2016 11:25 UTC (Fri) by mathstuf (subscriber, #69389)
Posted Nov 17, 2016 7:16 UTC (Thu) by brouhaha (subscriber, #1698)
Posted Nov 17, 2016 12:21 UTC (Thu) by fsateler (subscriber, #65497)
Posted Nov 17, 2016 13:13 UTC (Thu) by mathstuf (subscriber, #69389)
Posted Nov 17, 2016 13:11 UTC (Thu) by mathstuf (subscriber, #69389)
Posted Nov 30, 2016 19:20 UTC (Wed) by nix (subscriber, #2304)
Posted Nov 17, 2016 7:55 UTC (Thu) by halla (subscriber, #14185)
CMake doesn't "seem to come from the Windows world". As a quick look at Wikipedia (https://en.wikipedia.org/wiki/CMake#History) shows, CMake started out cross-platform, instead. And that's why I hope that the current trend, where more and more cross-platform libraries and applications get a CMake build system, will accelerate. Cross-platform is what counts for me these days, especially reliable cross-platform finding of dependencies.
Posted Nov 17, 2016 10:15 UTC (Thu) by drothlis (guest, #89727)
Ninja itself has good Windows support: It uses "response" files to work around command-line length limitations, it understands the dependency format generated by the Microsoft compiler, it has had performance optimisations motivated by operations that are particularly slow on Windows, and it has zero dependencies so it's easy to install.
I'm not sure whether the "good Windows support" extends to CMake+Ninja, or if CMake users on Windows still prefer CMake with the Visual Studio back end. Anybody? Presumably if you're using CMake+Makefiles then CMake+Ninja will just work.
Posted Nov 17, 2016 10:47 UTC (Thu) by halla (subscriber, #14185)
I used to use CPack at my day job, some years ago, and it worked fine creating distribution source tarballs, windows setup.exe using nsis and OSX app bundles in a disk image.
I used to use ninja with cmake on Windows, and on Linux, but on the whole, while it worked fine, it just didn't add much value to my workflow -- probably the structure of my project, structurally inherited from the autotools days, with 1,100 kloc and about 150 library targets isn't that suited to it. Or it was just my habit.
Posted Nov 17, 2016 11:33 UTC (Thu) by drothlis (guest, #89727)
On the other hand if I needed to create a Windows & OS X installer I wouldn't dream of doing it by hand, I'd use CMake/CPack.
The main personal discovery that I wanted to share was that when you have very custom needs[1] generating dumb build files (whether they're Ninja files or explicit Makefiles) from Python is way better than using `make` alone, or Python alone, or a higher-level build system that's designed for more conventional needs.
[1]: My own use case is more like an integrator / distro packager, where I'm building an Operating System image from many components (I'm also the upstream of some of those components, other components may be using CMake or some other build system).
Posted Nov 17, 2016 13:11 UTC (Thu) by mathstuf (subscriber, #69389)
Yes, Ninja is way faster than Visual Studio, both to build and to generate. Building is faster because Ninja does rule-level parallelization whereas msbuild does target-level parallelization. The generate step is faster because Ninja builds for a single build type (i.e., Debug vs. Release) at a time, while Visual Studio supports multiple configurations from a single generated .sln file. For projects with many generator expressions (which, if using the new `target_*` commands, is a lot), that multiplies the time out.
> Presumably if you're using CMake+Makefiles then CMake+Ninja will just work.
There are some things supported only in one and not the other, but they're rarely used (and those bits are documented as being generator-specific).
Posted Nov 17, 2016 8:52 UTC (Thu) by drothlis (guest, #89727)
https://chromium.googlesource.com/chromium/src/+/master/t...
Thanks to Nico Weber for pointing this out.
Posted Nov 17, 2016 11:20 UTC (Thu) by lkundrak (subscriber, #43452)
Posted Nov 17, 2016 18:22 UTC (Thu) by rabinv (guest, #99886)
Posted Nov 17, 2016 20:11 UTC (Thu) by pbonzini (subscriber, #60935)
Posted Nov 17, 2016 20:45 UTC (Thu) by rabinv (guest, #99886)
Posted Nov 17, 2016 21:58 UTC (Thu) by thoughtpolice (subscriber, #87455)
One of the reasons I bring it up is because Shake has support for reading and executing `.ninja` files! Originally, this feature was only used to benchmark Shake against Ninja to see how it fared (spoiler alert: it's pretty much just as fast). Shake also has a lot of other features, even when you only use it for Ninja; for example, it can generate profiling reports of your build system, so you can see what objects/rules took the most time, etc. I actually use LLVM's CMake build system to generate .ninja files, then use Shake to run the actual build. It's useful sometimes when I occasionally want to see what takes up the most time while compiling[1]. Some people here might like that. I believe the 'lint' mode in Shake can also detect classes of errors inside Ninja files like dependency violations, so that's useful too.
The actual Shake build system itself, however, is almost an entirely different beast, mostly because it's more like a programming language library you create build systems from, rather than a DSL for a specific tool: more like e.g. Waf than CMake, so to speak. So on top of things like parallelism pools like Ninja, extending that even further beyond, to incorporate features like distributed object result caching (a la Bazel/Blaze inside Google) is quite feasible and doable. It also has extremely powerful dependency tracking features; e.g. I can have a config file of key-value pairs, and Shake tracks changes all the way down to individual variable assignments themselves, not the actual mtime or whatever of the file. You can express a dependency on the output of `cc --version`, so if the user does `export CC=clang-4.0; ./rebuild`, only rules that needed the C compiler get rerun, etc. I've been using lots of these features in a small Verilog processor I've been working on. I can just run the timing analysis tool on my design, it generates a resulting report, run a parser to parse the report inside the build system itself, and the build can fail if the constraints are violated, with a pretty error-report, breakdown, etc in the terminal window. If I extended it, I could even get the build to give me longest paths, etc out of the resulting report.
It's almost life-changing when your build system is this powerful -- things that you'd previously express as bizarre shell scripts or "shell out" to other programs to accomplish, you can just write directly in the build system itself. This, in effect, completely changes the dynamics around what your build system can even do and what its responsibilities are. I find it surprisingly simple and freeing when everything can be done "In one place", so to speak, and I'm not as worried about taking on complex features that will end in a million tears down the road.
That said, Shake is on the extreme end of "I need a really powerful build system". It's only going to pay off with serious investment and need for the features. We're going to use it in the next version of the Glasgow Haskell Compiler, but our build system is an insanely complex non-recursive Make fiasco with all kinds of impressive tricks inside of it that have destroyed its maintainability over time (in an ironic twist of fate -- since most of these tricks were intended to make the build system more reliable and less brittle, but only came at a large cost. Don't look at how the sausage is made, etc etc.)
If you can, these days I normally suggest people just use something like Make, or CMake+Ninja. There are some fundamental concepts they might lack direct analogs of in comparison to Shake or whatever, but they're pretty good and most software doesn't *really* need an exceptionally complex build system. Honestly, I would probably just like Make a lot more if the terse syntax didn't get utterly ridiculous in some cases like interpolating inside macros, escaping rules, etc, and I'd like CMake more if it WAS_NOT_SO_VERBOSE.
[1] related: LLVM really, really needs a way to leverage Ninja pools for its link rules, because if you have too many cores, you'll eat all your RAM from 10x concurrent `ld` processes. I really hate that, because Ninja loves to automatically use up every core I have by default, even if it's 48+ of them :)
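For reference, the pools mechanism mentioned earlier in the article can express exactly this limit; a generator would emit something like the following (the pool name and depth are invented):

```ninja
# At most two link commands run concurrently, regardless of -j.
pool link_pool
  depth = 2

rule link
  command = gcc -o $out $in
  pool = link_pool
```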
Posted Nov 17, 2016 23:06 UTC (Thu) by karkhaz (subscriber, #99844)
Regarding your comment about linking. It seems that Daniel wants llbuild to use Clang _as a library_ rather than invoking it as a subprocess. More generally, he thinks that if build systems in the future were able to communicate with the build commands (rather than just spawning them and letting them do their thing) we would be able to get much more highly optimised builds...things like llbuild having its own scheduler so that it could run I/O- and CPU-intensive tasks together. May be worth listening to the talk once a video is posted. Exciting times!
Posted Nov 20, 2016 0:52 UTC (Sun) by thoughtpolice (subscriber, #87455)
At one point, someone had mentioned a similar thing for Shake and the GHC build system rewrite: why not use the compiler APIs directly in the build system to compile everything ourselves? I think it's a valid approach, though API instability makes things a little more complex, perhaps. We initially just wanted to port the existing system, which already has taken long enough! I do think it could improve a lot of things, though, at a first glance. The linker case is a good one.
I'll check the LLVM Dev video when I get a chance, thanks for pointing it out!
Posted Nov 21, 2016 10:44 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
But forking a brand new compiler for every file is even less elegant. Perhaps there could be a middle ground - why not create something like a "compilation server"? The simplest version can just be a simple read-eval loop that reads arguments from stdin and launches a compiler in a thread, multiplexing its output into stdout.
This can easily be adapted gradually for other compilers (gcc, icc, msvc), as it can gracefully degrade to simply spawning new processes.
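A minimal sketch of that degraded mode (purely illustrative; this is not an existing tool): read one command line per line of input and spawn each as an ordinary subprocess:

```python
import shlex
import subprocess
import sys
import threading

def serve(stream):
    # One job per input line; "degrade" by just spawning a fresh
    # compiler process for each request, in parallel.
    threads = []
    for line in stream:
        argv = shlex.split(line)
        if not argv:
            continue  # skip blank lines
        t = threading.Thread(target=subprocess.run, args=(argv,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    return len(threads)  # number of jobs run

if __name__ == "__main__":
    serve(sys.stdin)
```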
Posted Nov 30, 2016 19:56 UTC (Wed) by nix (subscriber, #2304)
Posted Nov 18, 2016 5:23 UTC (Fri) by wahern (guest, #37304)
http://www.kaizou.org/2016/09/build-benchmark-large-c-pro...
One of the takeaways is that CMake does a poor job at producing efficient ninja files. If you're using CMake to generate ninja, it's like getting the worst of both worlds.
Another takeaway, I think, is that non-recursive Make is really efficient. Theoretically there's little to distinguish a non-recursive Make build from a ninja build. The basic syntax is very similar. The real bottleneck is GNU Make's implementation, which is lumbering after decades of feature accretion and hacks. (OTOH, it's also much more flexible. ninja's auto-dependency generation only works for C and C++ using GCC syntax, whereas the GNU Make solution is totally generic.)
Regarding GNU Make spending so much time including auto-generated header dependencies (mentioned in http://david.rothlis.net/ninja-benchmark/), I bet that could be addressed by generating a single include file per directory instead of per source file. As you showed in your benchmarks, even ninja spends most of its time parsing, and its parser is leaner and simpler than GNU Make's.
Posted Nov 18, 2016 17:24 UTC (Fri) by jhhaller (guest, #56103)
Has anyone seen this optimization done?
Posted Nov 21, 2016 11:33 UTC (Mon) by gerv (guest, #3376)
It depends on what you mean by "recursive makefiles". I haven't studied them in detail but it appears that the CMake-generated makefiles have one makefile per high-level target (program, library, etc.), where that makefile contains all the rules needed to build that target (these can be lengthy). The "master" makefile runs one make instance for each of these targets.
CMake and recursive makefiles
$ pstree -a 6453
bash
└─make -j8
└─make -f CMakeFiles/Makefile2 all
├─make -f Dir1/CMakeFiles/Dir1.dir/build.make...
│ └─sh -c...
│ └─cmake -E cmake_dependsUnix M
├─make -f Dir1/CMakeFiles/Dir1Static.dir/build.make...
│ └─sh -c...
│ └─ccache...
│ └─x86_64-generic-
│ └─cc1plus
├─make -f Dir3/CMakeFiles/Dir3IFace.dir/build.make...
│ └─sh -c...
│ └─ccache...
│ └─x86_64-generic-
│ └─cc1plus
├─make -f Dir4/CMakeFiles/Dir4.dir/build.make...
│ └─sh -c...
│ └─ccache...
│ └─x86_64-generic-
│ └─cc1plus
├─make -f Dir5/CMakeFiles/Dir5.dir/build.make...
│ └─sh -c...
│ └─ccache...
│ └─x86_64-generic-
│ └─cc1plus
├─make -f Dir6/CMakeFiles/Dir6.dir/build.make...
│ └─sh -c...
│ └─ccache...
│ └─x86_64-generic-
│ └─cc1plus
├─make -f Dir3/CMakeFiles/Dir3Test.dir/build.make...
│ └─sh -c...
│ └─ccache...
│ └─x86_64-generic-
│ └─cc1plus
└─make -f Dir7/CMakeFiles/crashtest.dir/build.make...
└─sh -c...
└─cmake -E cmake_link_scriptCM
└─x86_64-generic-
└─collect2
└─ld
If you want "a real programming language like Python", you can use SCons which is fully in Python, and doesn't have to invoke a different build tool with a different syntax.
Presumably they called it "ninja" because all of the names formed from prefixing the word "make" with a single letter were taken?
Gerv