Nftables reaches 1.0

By Jonathan Corbet
August 27, 2021

The Linux kernel is a fast-moving project, but change can still be surprisingly slow to come at times. The nftables project to replace the kernel's packet-filtering subsystem has its origins in 2008, but is still not being used by most (or perhaps even many) production firewalls. The transition may be getting closer, though, as highlighted by the release of nftables 1.0.0 on August 19.

The first public nftables release was made by Patrick McHardy in early 2009. At that time, the kernel had a capable packet-filtering subsystem in the form of iptables, of course, that was in widespread use, but there were a number of problems driving a change. These include the fact that the kernel had (and still has) more than one packet-filtering mechanism: there is one for IPv4, another for IPv6, yet another for ARP, and so on. Each of those subsystems is mostly independent, with a lot of duplicated code. Beyond that, iptables contains an excessive amount of built-in protocol knowledge and suffers from a difficult API that, among other things, makes it impossible to update a single rule without replacing the entire set.

The core idea behind nftables was to throw away all of that protocol-aware machinery and replace it with a simple virtual machine that could be programmed from user space. Administrators would still write rules referring to specific packet-header fields and such, but user-space tooling would translate those rules into low-level fetch and compare operations, then load the result into the kernel. That resulted in a smaller packet-filtering engine that was also far more flexible; it also had the potential to perform better. It looked like a win, overall, once the minor problem of transitioning a vast number of users had been overcome.
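As a rough illustration of that split between user-space translation and a small in-kernel engine, here is a toy sketch in Python. It is not the real nftables instruction set or netlink API: the opcode names (`load`, `cmp_eq`, `verdict`), the `compile_rule()` helper, and the packet layout (a 20-byte IPv4 header with no options, followed directly by the transport header) are all assumptions made for the example.

```python
import struct

def run(program, packet):
    """Toy engine: knows nothing about protocols, only fetch and compare."""
    regs = {}
    for op, *args in program:
        if op == "load":                      # fetch `length` bytes at `offset` into a register
            reg, offset, length = args
            regs[reg] = packet[offset:offset + length]
        elif op == "cmp_eq":                  # on mismatch, this rule does not apply
            reg, value = args
            if regs.get(reg) != value:
                return "continue"
        elif op == "verdict":                 # every comparison matched
            return args[0]
    return "continue"

def compile_rule(proto, dport, verdict):
    """Hypothetical user-space side: turns field names into offsets and bytes."""
    return [
        ("load", 1, 9, 1),                        # IPv4 protocol field
        ("cmp_eq", 1, bytes([proto])),
        ("load", 1, 22, 2),                       # TCP/UDP destination port
        ("cmp_eq", 1, struct.pack("!H", dport)),
        ("verdict", verdict),
    ]

def make_packet(proto, sport, dport):
    """Build a minimal fake packet for the sketch (most header fields left zero)."""
    ip = bytearray(20)
    ip[0] = 0x45                                  # version 4, header length 5 words
    ip[9] = proto
    return bytes(ip) + struct.pack("!HH", sport, dport)

rule = compile_rule(proto=6, dport=22, verdict="drop")
print(run(rule, make_packet(6, 40000, 22)))
print(run(rule, make_packet(17, 40000, 22)))
```

All of the protocol knowledge lives in `compile_rule()`; the engine in `run()` would not need to change to match on a different header field.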

Nftables made a bit of a splash when it was launched, but then bogged down and disappeared from view, perhaps because McHardy decided he had more interesting opportunities to pursue in courtrooms. In 2013, though, Pablo Neira Ayuso restarted the project with the idea of getting the code merged into the mainline as soon as possible. That part succeeded; nftables found its way into the 3.13 kernel release at the beginning of 2014.

The work since then has been a hard slog of filling in the gaps and making nftables sufficiently appealing that users would want to make the transition. The language used to write filtering rules has gained a long list of features for stateful tracking, address mapping, efficient handling of address intervals and large rule chains, and support for numerous protocols. There was also documentation to write, of course; the nftables wiki has a lot of information about how it all works.

There is, of course, one other significant impediment to transitioning away from iptables: the vast number of deployed, working firewalls using the latter. In many cases, rewriting the firewall rules may be the best course of action because many complex filtering setups can be expressed much more efficiently in the new scheme. But, for administrators who just want their painfully developed firewall to keep working, the benefits of nftables may be less appealing than one might expect. The nftables developers have written a set of scripts to translate iptables firewalls into the nftables equivalent, which should help, but it is still a big jump.

In some cases, users may eventually make that jump without even noticing, though. Linux distributions have carried support for nftables for some time now, and work is being done to port tools like Red Hat's firewalld to nftables. In cases like this, users may have never seen the iptables rules in the first place and, with luck, will not notice that the underlying mechanism has changed.

When will that change happen? It is still somewhat hard to say. The 2018 Netfilter Workshop decreed that iptables is "a legacy tool" whose days are numbered. Debian switched to nftables by default in the 2019 Debian 10 "buster" release, though Ubuntu didn't follow until the 21.04 release. While almost all distributions ship nftables, many of them have yet to make the switch to use it by default.

The release of nftables 1.0.0 can be seen as a signal that it is time for the laggards to get more serious about making the switch. While it is hard to imagine iptables support being removed anytime soon, it's rather easier to foresee that enthusiasm for maintaining it will continue to wane. New features will show up in nftables instead, and users will eventually need to migrate over to take advantage of them. It only took 13 years, but this transition finally appears to be heading into its final stage.

There is, however, one other interesting question. In 2018, the BPF developers announced bpfilter, a packet-filtering mechanism that runs on the BPF virtual machine. The announcement drew some attention at the time; BPF had (and has) a lot of momentum, and a lot of work has been done to optimize the virtual machine and make it safe to use. Arguably, it makes sense to use that rather than maintain yet another virtual machine just for packet filtering. That would allow the removal of a bunch of code and the focusing of maintenance effort on BPF.

The bpfilter code was merged for the 4.18 kernel release; it also brought in a "user-mode blobs" mechanism that was intended to facilitate the translation of firewall rules to the new machine. Since then, however, development on this code has come to a halt; there have been exactly two (trivial) commits to the code in net/bpfilter in 2021. The removal of this code was discussed in June 2020 but it survived at that time. Since then, the cobwebs have only gotten thicker; it seems fair to say that bpfilter is not an active area of development at this point, and that it seems unlikely to displace nftables anytime soon.

Whether that is the "right" outcome is hard to say. Perhaps the special-purpose virtual machine used by nftables is a better solution to this particular problem than the more general BPF. Or possibly nftables came out on top simply because the developers behind it continued to show up and push the project forward. One of the keys to success in kernel development is simple persistence; that is doubly true for a critical subsystem like packet filtering, where it is more than reassuring to know that the developers are in it for the long haul.


Nftables reaches 1.0

Posted Aug 27, 2021 15:46 UTC (Fri) by magfr (subscriber, #16052) [Link] (1 responses)

I suppose it was inevitable. BPF is used for absolutely everything in the kernel with one exception:
Packet filtering.

Nftables reaches 1.0

Posted Aug 27, 2021 18:11 UTC (Fri) by aszs (subscriber, #50252) [Link]

As amusing as that would be, it doesn't seem likely given https://lwn.net/Articles/858173/, for better or worse...

Nftables reaches 1.0

Posted Aug 27, 2021 15:57 UTC (Fri) by johill (subscriber, #25196) [Link] (2 responses)

Couldn't you kind of compile NFT to BPF?

Today NFT has a whole bunch of 'eval' methods, so to compile to BPF you just need to have a function that returns a few BPF instructions instead. Where not implemented, provide a BPF helper function that calls the existing eval function from BPF.

It doesn't even seem that hard, and if you implement the most commonly used 'eval' methods directly and then send the program through the compiler you'll probably already win something?

Nftables reaches 1.0

Posted Aug 29, 2021 7:12 UTC (Sun) by nilsmeyer (guest, #122604) [Link] (1 responses)

According to the LWN bpfilter article this is already possible with iptables:
https://lwn.net/Articles/747551/

Under: "Bringing in BPF"
> One of the core design features for bpfilter is the ability to translate existing iptables rules into BPF programs.

Nftables reaches 1.0

Posted Aug 29, 2021 7:43 UTC (Sun) by johill (subscriber, #25196) [Link]

I don't think that's how it works - it wants to compile iptables (not nftables) _rules_, but as far as I can tell it has a separate userspace etc. that does all of that, rather than doing a sort of "VM to VM" translation in the kernel I was thinking of (NFT VM to BPF VM)

Nftables reaches 1.0

Posted Aug 27, 2021 22:54 UTC (Fri) by jkingweb (subscriber, #113039) [Link] (6 responses)

I set up the firewall on my home server using nftables as a learning exercise circa 2016 or thereabouts after my ISP deployed IPv6 service, with no significant prior experience in packet filtering. I found the experience decent-but-finicky, with some disappointing duplication of effort to handle both IPv4 and IPv6.

I'm not sure what documentation I was following at the time; it may have led me down suboptimal paths, or things may have improved in the years since. I'll have to give it another look!

Nftables reaches 1.0

Posted Aug 28, 2021 6:47 UTC (Sat) by wtarreau (subscriber, #51152) [Link] (1 responses)

In my opinion it has significantly improved over the years. I'm using it at home as well and it's way better than iptables. There are some places where you still can't merge IPv4 and IPv6 rules, resulting in some duplicated effort, but I found that it remained reasonable (though more unification would always be welcome of course).

The really nice thing compared to iptables is the instant and atomic load of the rules. No more situation where the nat table loads while the filter table fails etc. And the ability to define objects supporting lists about everywhere (ports, hosts etc) is great. I used to do that using scripts requiring a more complex language to automatically produce iterations. Now it is natural in the config language.

What still really annoys me is the lack of command-line help. I promised Pablo I would some day send him a patch for this but still failed to find sufficient time to work on it. Having to go to the wiki to figure out that you need to type "nft list ruleset" after not having used it for 2 months is pretty annoying, especially when you've been used to "iptables -h" providing very detailed syntax information. But this minor user-interface aspect aside, nftables is a great technology that is far closer to the spirit of traffic filtering than ipfwadm, ipchains or iptables could be, making it extremely user-friendly.

It's difficult to adopt it, but it's really worth it. Most of the effort is to convert the existing config. I would strongly encourage new firewall deployments to start with nftables, as it will be much easier than iptables for the first setup, and will not require any conversion.

Nftables reaches 1.0

Posted Aug 29, 2021 3:59 UTC (Sun) by josh (subscriber, #17465) [Link]

I definitely like nftables better than iptables, both for atomicity and for syntax.

But I do wish the documentation was much better, especially the documentation for the kernel-to-userspace interfaces.

Nftables reaches 1.0

Posted Aug 28, 2021 15:41 UTC (Sat) by hailfinger (subscriber, #76962) [Link] (3 responses)

I had the questionable idea to set up a nftables based packet filter on various Debian Buster systems in 2020/2021 because nftables was declared to be the future. Lessons learned:
- The syntax is nice once you get used to it and I think most of it is more easily readable due to the structure
- The documentation was incomplete, especially for NAT
- If the documentation says that two ways to specify a rule are equivalent you should verify that instead of blindly rewriting working rules
- Concatenations are cool, but rarely work
- Order within a single rule matters sometimes
- Combining the same rule from "table ip nat" and "table ip6 nat" into "table inet nat" only works in some cases
- If your kernel and the nftables userspace are not the same age you will run into problems, so either upgrade both or none, this may be different now that 1.0 is released
- Kernel 5.10 is roughly where most of the interesting functions start working if your userspace is new enough and nftables 0.9.6 (Buster Backports) is similarly a point where things start working better
- On Debian Buster without backports the whole thing is really painful, it's manageable with backports
- Priorities as keywords (introduced in 0.9.6) instead of priorities as numbers helps a lot with readability compared to older versions
- Error messages exist, but in nftables 0.9.6 (from 2020) they were as helpful as gcc error messages ("error: expected ‘asm’ or ‘__attribute__’" instead of "missing semicolon") from the era before llvm; they are a bit better now

Overall, I think nftables has a nice future ahead and I'm looking forward to testing nftables 1.0.

Nftables reaches 1.0

Posted Aug 28, 2021 20:25 UTC (Sat) by pbonzini (subscriber, #60935) [Link] (2 responses)

> gcc error messages from the era before llvm,

Ahah, that's actually a coincidence. GCC error messages for C were bad mostly due to the usage of yacc for the parser. When the parser was rewritten as recursive descent in 2004 by Joseph Myers, that laid the foundation for improving error recovery. They then finally improved when GCC developers including myself got fed up with a few particularly egregious cases[1][2].

But competition with llvm wasn't particularly involved. In fact for C++ (which used recursive descent since before clang was started) error message quality has always been comparable to clang.

More recently (and long after I had stopped working on GCC), David Malcolm did a huge amount of work on caret diagnostics, where GCC's front ends were indeed lagging behind. But that's a different story.

[1] https://gcc.gnu.org/legacy-ml/gcc-patches/2010-10/msg0261...

[2] https://gcc.gnu.org/legacy-ml/gcc-patches/2010-11/msg0180...

Nftables reaches 1.0

Posted Aug 28, 2021 21:24 UTC (Sat) by Paf (subscriber, #91811) [Link] (1 responses)

My experience with C++ errors around 2010-2011 (with whatever was in Ubuntu at the time) was there was a lot of multi-page spew when using even fairly simple templates?

Nftables reaches 1.0

Posted Aug 30, 2021 13:22 UTC (Mon) by pbonzini (subscriber, #60935) [Link]

I have never done serious C++, but I think the issue there was that the error messages were overly precise and expanded the same template typenames over and over. At some point a couple tweaks were made, teaching the compiler about default template arguments and typedefs.

In C, the problem was abysmal error recovery, causing dozens of cascaded errors for a single missing semicolon or fat-fingered type name (such as "intt" or "unsgined char"). With a recursive descent parser it's relatively easy and maintainable to add heuristics that look ahead and insert missing tokens or fix things up as necessary. For example if you see two consecutive unknown identifiers, it's likely that the first is a misspelled type and the second is a variable name. With some luck, that will remove a lot of errors involving that variable, because the compiler now knows about it and treats it as declared.
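The "two consecutive unknown identifiers" heuristic can be sketched in a few lines of Python. This is a toy statement checker for an invented mini-language (`type name ;` and `name = number ;`), not GCC's parser or any part of it; the `check()` function, its `recover` flag, and the error strings are made up for the illustration.

```python
KNOWN_TYPES = {"int", "char", "unsigned"}

def check(tokens, recover=True):
    """Return the list of errors for a token stream of `;`-terminated statements."""
    errors, declared, i = [], set(), 0
    while i < len(tokens):
        # Slice out one statement, up to (not including) the next semicolon.
        end = tokens.index(";", i) if ";" in tokens[i:] else len(tokens)
        stmt = tokens[i:end]
        i = end + 1
        if len(stmt) == 2 and stmt[0] in KNOWN_TYPES:
            declared.add(stmt[1])                 # ordinary declaration
        elif len(stmt) == 2 and all(t.isidentifier() for t in stmt):
            # Two identifiers in a row, the first unknown: probably a
            # misspelled type followed by a variable name.
            errors.append(f"unknown type name '{stmt[0]}'")
            if recover:
                declared.add(stmt[1])             # treat the variable as declared anyway
        elif len(stmt) == 3 and stmt[1] == "=":
            if stmt[0] not in declared:
                errors.append(f"'{stmt[0]}' undeclared")
        else:
            errors.append(f"syntax error near '{' '.join(stmt)}'")
    return errors

source = "intt x ; x = 1 ; x = 2 ;".split()
print(check(source, recover=False))   # the typo plus one error per later use of x
print(check(source, recover=True))    # only the typo itself
```

With recovery disabled, every later use of `x` produces its own "undeclared" error; with it enabled, only the misspelled type is reported.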

Nftables reaches 1.0

Posted Aug 29, 2021 13:49 UTC (Sun) by moorray (subscriber, #54145) [Link] (1 responses)

> McHardy decided he had more interesting opportunities to pursue in courtrooms. In 2013, though, Pablo Neira Ayuso restarted the project

Is this timeline right? I remember wondering if Patrick got unhinged *because* Pablo’s implementation got picked over his. The quote makes it sound like Pablo came after. It was before my time tho.

Nftables reaches 1.0

Posted Aug 30, 2021 19:59 UTC (Mon) by armijn (subscriber, #3653) [Link]

Yes, this timeline is right. McHardy's contributions took a very sharp dive around September 2011 and he was only sporadically active in the years after that, with almost nothing happening in 2012. Pablo's contributions didn't start increasing until after McHardy dropped out.

Nftables reaches 1.0

Posted Aug 30, 2021 6:46 UTC (Mon) by carORcdr (guest, #141301) [Link]

"The nftables project to replace the kernel's packet-filtering subsystem has its origins in 2008, but is still not being used by most (or perhaps even many) production firewalls. "

Glad to see a substantive article on nftables and iptables. Do you have any numbers on the use in production firewalls?

In Rusty's words:

When your Linux box is the only thing between the chaos of the Internet and your nice, orderly network, it's nice to know you can restrict what comes tromping in your door.

Rusty Russell, Linux IPCHAINS-HOWTO, v1.0.8 (2000-07-04)

Nftables reaches 1.0

Posted Aug 30, 2021 9:13 UTC (Mon) by taladar (subscriber, #68407) [Link] (6 responses)

I just had a look at the CLI command nft and it still seems extremely unpolished.

When I call nft --help I get

> Usage: nft [ options ] [ cmds... ]
> [...]

but not a single command is listed in the help output, nor another command/option that would display that information.

When I try nft help I get

> Operation not permitted (you must be root)
> Error: syntax error, unexpected newline, expecting string
> help
> ^

which seems like a weird mix of errors and also "unexpected newline" is an odd error to emit for commandline parameters, not to mention that it is far too low level in general.

There is also no obvious option in the --help output to list the currently active ruleset.

On top of that, since firewalls are quite complex we will be unlikely to maintain an iptables and an nftables version of our rulesets in our Puppet configuration management so a working and usable and fully featured version will have to be part of the oldest distros we use before it is even something to consider, so I would imagine nothing will happen before about 2030 since the current version doesn't really look usable yet.

Nftables reaches 1.0

Posted Aug 30, 2021 14:35 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (5 responses)

The full list of commands is in the nft(8) manual page; it would be too long to include in the --help output. If you're using iptables-nft (which is the default iptables backend in Debian starting with Buster) then you can list your current iptables rules in nft syntax with the command "sudo nft -s list ruleset".

Nftables reaches 1.0

Posted Sep 1, 2021 18:58 UTC (Wed) by Chousuke (subscriber, #54562) [Link] (4 responses)

Unfortunately the nft manual page seems to take after the iproute2 suite of tools in being extremely light on examples and leaving the reader to figure out how to put things together from rather loosely organized grammar descriptions and tables. You basically have to guess how to use it.

For example, if you wanted to know how to perform a 1:1 nat for an entire IP prefix, the manual page would not help because it doesn't even mention that you can use bitwise operators (&, |) with netmasks to perform calculations and modifications on packet fields.

I know there's a partial sentence somewhere on the wiki page that indirectly hints at this being possible because I found it some time ago when I had to do prefix translation, but I can't find it anymore.

nftables is capable, but its documentation makes me sad. It's unbelievably bad.

Nftables reaches 1.0

Posted Sep 1, 2021 19:27 UTC (Wed) by Chousuke (subscriber, #54562) [Link] (1 responses)

Replying to myself since I can't edit to give an actual example:

I tried finding the relevant documentation from the wiki page but I can't; I've forgotten where I found it the last time. The manual page says "Expressions can be combined using binary, logical, relational and other types of expressions", but *nowhere* does it detail what those "binary", "logical" or "relational" expressions are. It doesn't even contain the word "operator".

I did find out that man libnftables-json at least lists "binary operations", but there's no context.

Just in case someone ends up needing it, you can do stuff like this:

ip daddr 10.240.1.0/24 dnat to ip daddr & 0.0.0.255 | 10.140.7.0;

I don't even remember how I figured that out the first time, but it wasn't thanks to the documentation.
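For anyone wanting to check what that expression does to a given address, the arithmetic can be reproduced with Python's standard ipaddress module. This is only a sketch of the bit manipulation, not of how nftables evaluates the rule internally; the `translate()` helper and its parameter names are made up for the example.

```python
import ipaddress

def translate(daddr, host_mask="0.0.0.255", new_prefix="10.140.7.0"):
    """Compute (daddr & host_mask) | new_prefix on 32-bit IPv4 addresses."""
    addr = int(ipaddress.IPv4Address(daddr))
    mask = int(ipaddress.IPv4Address(host_mask))
    prefix = int(ipaddress.IPv4Address(new_prefix))
    # Keep only the host bits of the original address, then OR in the new prefix.
    return str(ipaddress.IPv4Address((addr & mask) | prefix))

print(translate("10.240.1.37"))   # 10.140.7.37
```

The mask keeps the last octet (the host part of the /24) and the OR supplies the new network part, so every host in 10.240.1.0/24 maps to the host with the same final octet in 10.140.7.0/24.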

Nftables reaches 1.0

Posted Sep 9, 2021 4:48 UTC (Thu) by chaispaquichui (guest, #77035) [Link]

Very useful, thanks !

Nftables reaches 1.0

Posted Sep 2, 2021 5:19 UTC (Thu) by carORcdr (guest, #141301) [Link] (1 responses)

I can appreciate the concern for the lack of examples, but if you actually look at all the manual pages for the 100+ programs (arpd...tipc-socket) there are actually a significant number of examples. If I decide to list them I will update this comment.

There are many non-iproute2 programs, including significant ones, that have far fewer examples. Some have none.

My definition of an example in the context of a program is a command string--

$|# program argument[s] file|filepath

I realize some may limit the definition of string to alphabetic characters. I do not. My definition of string is a string of characters--alphabetic, numeric and/or symbolic.

Nftables reaches 1.0

Posted Sep 3, 2021 19:10 UTC (Fri) by Chousuke (subscriber, #54562) [Link]

Maybe picking on iproute2 was a bit unfair; I just remembered spending a lot of time trying to decipher the dense synopsis notation way back when. Taking another look, they're definitely better than what nft has.

Lately I've felt a bit spoiled by OpenBSD manual pages. If you want to know what good documentation with man pages can look like, you can take a look at some of them. If everything were documented to the same standard I would never need Google...

For example, if I want a quick overview on how OSPF works, I can just "man ospfd" on OpenBSD. The explanation may not strictly speaking have much to do with configuring ospfd itself, but well-placed context "fluff" is a huge quality-of-life improvement as it helps me understand the kinds of problems I can solve with the software.

Nftables reaches 1.0

Posted Aug 30, 2021 11:22 UTC (Mon) by evgeny (subscriber, #774) [Link]

Writing and maintaining a complex FW config by hand is really a pain - especially when more than one firewall is involved and part of the rules must be kept in sync. Long ago, I started using fwbuilder. There is no support for nftables yet, but hopefully, that will change once nft becomes mainstream.

Nftables reaches 1.0

Posted Sep 1, 2021 18:17 UTC (Wed) by flussence (guest, #85566) [Link] (1 responses)

I want to like nftables, honestly, but after years of using it it's still incredibly sharp and brittle for something that's supposed to someday supplant the firewall everyone currently uses.

I'd write out a laundry list of the snags I hit regularly but it turns out one already exists (https://bugzilla.netfilter.org/show_bug.cgi?id=1461). I've got a fail2ban setup that barely works; sometimes after a few hours of operation it refuses to add an address to a set that doesn't contain it (this smells like unhandled hash collision... bug 1392?) — much worse is that sometimes adding a /32 is randomly and silently corrupted into a range covering half the ipv4 internet (bug 1438 - note that it's happened to me even though I'm not setting auto-merge). I've found that appending a literal "/32" to the input prevents the latter, but I don't understand why.

In spite of that I'll continue to use it because it's easier to reason about rules that look like C instead of COBOL. The fundamental design at least seems sound and none of my gripes are unsolvable, I just wish I didn't have to handhold it so much.

Nftables reaches 1.0

Posted Sep 9, 2021 4:13 UTC (Thu) by splitice (guest, #154172) [Link]

I'm in a similar camp. As someone who maintains and has developed a lot of iptables modules I can certainly see the room for improvement, but I can't help but think that nftables made just as many steps forward as it did steps backwards.

Fingers crossed bpfilter will hit the mark better.

Nftables reaches 1.0

Posted Sep 2, 2021 14:01 UTC (Thu) by ecree (guest, #95790) [Link]

Regarding bpfilter and its stagnation, I have a little story to tell. Back when bpfilter was new, Davem asked me if I'd lend a hand with the code generator (in the user-mode blob that translates iptables rulesets to BPF programs); I replied that I'd like to but that I couldn't find the documentation of the iptables uAPI/ABI and I didn't know it well enough to work without docs. (include/uapi/linux/netfilter_ipv4/ip_tables.h is… unenlightening.)

I heard nothing back, leading me to suspect that maybe the problem is that no-one *else* can remember all the corners of iptables either. 'The implementation is the spec' is fine until you want to replace the implementation.