
Xtables2 vs. nftables

Posted Jan 13, 2013 11:50 UTC (Sun) by malor (guest, #2973)
Parent article: Xtables2 vs. nftables

> while nftables defines a new virtual machine to process packets.

Hmm. While conceptually nice, what's that going to do to throughput? Are they going to be JIT-compiling it? And, if so, has anyone thought about security implications?

It seems to me that iptables, even with its internal warts, is one of the best features in Linux, both powerful and extremely fast. Throwing away a good design because it's old has a strong flavor of NIH. Doing a virtual machine just to do one seems pretty silly to me; what would the specific advantages be? If it's for weird packet mangling, is the overhead of a virtual machine worth carrying around to handle those corner cases better? Or would they be better served by userspace code of some kind?



Xtables2 vs. nftables

Posted Jan 14, 2013 21:26 UTC (Mon) by intgr (subscriber, #39733)

Disclaimer: I'm not a kernel developer; this is simply my understanding of things, based on experience with iptables.

> what would the specific advantages be?

Because that's the natural way people want to write firewall rules.

Right now each firewall rule has to stand on its own, and you get no control over the order in which its terms are evaluated. For example, if you want to whitelist SSH connections from three different IP addresses, you basically have to write the firewall like:

if (net_layer == IP && ip_address == 1.2.3.4 && transport_layer == TCP && tcp_port == 22) { ACCEPT; }
if (net_layer == IP && ip_address == 2.2.2.2 && transport_layer == TCP && tcp_port == 22) { ACCEPT; }
if (net_layer == IP && ip_address == 3.3.3.3 && transport_layer == TCP && tcp_port == 22) { ACCEPT; }
if (net_layer == IP && transport_layer == TCP && tcp_port == 22) { DROP; }

Not only is this annoying to write and manage, but it's also very inefficient. Clearly any sane person would instead write:

if (net_layer == IP && transport_layer == TCP && tcp_port == 22) {
  if (ip_address == 1.2.3.4 || ip_address == 2.2.2.2 || ip_address == 3.3.3.3) { ACCEPT; }
  DROP;
}

And while you can use separate rule chains to abstract out these patterns, that's like going back in time many decades of computer programming and using "goto" statements for all your control logic.
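To put rough numbers on the two layouts above, here is a minimal Python sketch that counts how many individual term tests each form performs. It assumes terms are evaluated left to right and stop at the first failure. The packet fields, helper names, and the "CONTINUE" fall-through verdict are made up for illustration and are not part of iptables.

```python
# Toy model of the flat and nested pseudocode above (illustrative names only).
ALLOWED = ("1.2.3.4", "2.2.2.2", "3.3.3.3")

def is_ip(pkt):
    return pkt["net_layer"] == "IP"

def is_tcp(pkt):
    return pkt["transport_layer"] == "TCP"

def is_ssh(pkt):
    return pkt["dport"] == 22

def from_ip(ip):
    return lambda pkt: pkt["src"] == ip

def all_match(pkt, terms, counter):
    """Evaluate terms left to right, stopping at the first failure."""
    for term in terms:
        counter[0] += 1
        if not term(pkt):
            return False
    return True

def flat(pkt):
    """Every rule repeats every term, as in the first listing."""
    counter = [0]
    for ip in ALLOWED:
        if all_match(pkt, [is_ip, from_ip(ip), is_tcp, is_ssh], counter):
            return "ACCEPT", counter[0]
    if all_match(pkt, [is_ip, is_tcp, is_ssh], counter):
        return "DROP", counter[0]
    return "CONTINUE", counter[0]  # hypothetical "no rule matched" marker

def nested(pkt):
    """Shared terms are tested once, as in the second listing."""
    counter = [0]
    if all_match(pkt, [is_ip, is_tcp, is_ssh], counter):
        for ip in ALLOWED:
            counter[0] += 1
            if pkt["src"] == ip:
                return "ACCEPT", counter[0]
        return "DROP", counter[0]
    return "CONTINUE", counter[0]
```

In this toy model, an SSH packet from an unlisted address costs 9 term tests in the flat form and 6 in the nested form; a TCP packet to another port costs 9 versus 3. The exact counts depend on the term order chosen in each rule, so treat them as an illustration of the shape of the problem rather than a measurement.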

For this particular use case, you could also use the iptables "ipset" module (which can match a set of IP addresses in one rule), but that's more of a workaround for the shortcomings of iptables: it requires a separate user space utility to manage these custom named IP address sets via a separate kernel API. There are tons and tons of these special-case modules.

There's also the problem that, currently, every different kind of rule requires support in user space (to parse the command line and serialize it for the kernel) AND in the kernel (to deserialize the data and do the matching specific to this rule). Basically 95% boilerplate and 5% substance -- a waste of developer resources, memory, CPU cache, etc.

It would be a lot more flexible to provide an abstract virtual machine in the kernel and let user space generate whatever code it needs to support the protocol it wants. That's how BPF already works in the kernel, for packet capture and seccomp system call filters.
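As a rough illustration of that split, the Python sketch below has a "user space" function compile the SSH whitelist into a small program, and a generic interpreter run it against a packet. The three-instruction set (ld, jeq, ret) is invented for this example and only loosely follows the accumulator-plus-conditional-jump style of classic BPF; it is not the nftables or BPF instruction set, and the "CONTINUE" verdict is a made-up fall-through marker.

```python
def compile_whitelist(port, allowed_ips):
    """'User space' side: turn a policy into a program for the toy VM."""
    n = len(allowed_ips)
    prog = [
        ("ld", "proto"),
        ("jeq", "tcp", 0, n + 5),   # not TCP: skip ahead to ret CONTINUE
        ("ld", "dport"),
        ("jeq", port, 0, n + 3),    # wrong port: skip ahead to ret CONTINUE
        ("ld", "src"),
    ]
    for i, ip in enumerate(allowed_ips):
        prog.append(("jeq", ip, n - i, 0))  # match: skip ahead to ret ACCEPT
    prog += [("ret", "DROP"), ("ret", "ACCEPT"), ("ret", "CONTINUE")]
    return prog

def run(prog, pkt):
    """'Kernel' side: a generic interpreter that knows nothing about SSH."""
    acc = None
    pc = 0
    while pc < len(prog):
        op, *args = prog[pc]
        pc += 1
        if op == "ld":        # load a packet field into the accumulator
            acc = pkt[args[0]]
        elif op == "jeq":     # relative jump: jt if equal, else jf
            value, jt, jf = args
            pc += jt if acc == value else jf
        elif op == "ret":     # finish with a verdict
            return args[0]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    raise ValueError("program ran off the end without a verdict")
```

Supporting a new policy here means changing only compile_whitelist; the interpreter stays the same, which is the flexibility argument being made above.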

> iptables, even with its internal warts, is one of the best features in Linux, both powerful and extremely fast.

As a programmer, I would say extremely contrived and inefficient. I'm genuinely surprised it has survived for this long.

Xtables2 vs. nftables

Posted Jan 14, 2013 21:54 UTC (Mon) by paulj (subscriber, #341)

You're badly misrepresenting iptables though. The chains are NOT like goto; they're like functions, which can return to the calling chain in addition to terminating rule processing for the packet. So your iptables example can be factored in several ways. E.g.:

accept_allowed_ssh_hosts () {
  if (proto != tcp)
    RETURN;
  if (port != ssh)
    RETURN;

  if (ip == 1.2.3.4) ACCEPT;
  if (ip == 2.2.2.2) ACCEPT;
  if (ip == 3.3.3.3) ACCEPT;
}

And somewhere in INPUT:

accept_allowed_ssh_hosts ();
…
DROP;

Note also the existing iptables language could be compiled to something suitable for a JIT. If there's any control-flow it is missing, it could be added, without throwing away the interface that is there today.
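The call-and-return behaviour described above can be modelled in a few lines of Python. This is only a sketch of the semantics (a jump to a user chain, RETURN, falling off the end of a user chain, and terminating verdicts), not of how the kernel implements them; the chain table layout and the run_chain helper are assumptions made for the example, and the rules elided with "…" in INPUT are simply left out.

```python
ACCEPT, DROP, RETURN = "ACCEPT", "DROP", "RETURN"

# Each chain is an ordered list of (match, target) rules. A target is either
# a verdict, RETURN, or the name of another chain to call.
CHAINS = {
    "accept_allowed_ssh_hosts": [
        (lambda p: p["proto"] != "tcp", RETURN),
        (lambda p: p["dport"] != 22, RETURN),
        (lambda p: p["src"] == "1.2.3.4", ACCEPT),
        (lambda p: p["src"] == "2.2.2.2", ACCEPT),
        (lambda p: p["src"] == "3.3.3.3", ACCEPT),
    ],
    "INPUT": [
        (lambda p: True, "accept_allowed_ssh_hosts"),
        (lambda p: True, DROP),
    ],
}

def run_chain(chains, name, pkt):
    """Return a final verdict, or None to hand control back to the caller."""
    for match, target in chains[name]:
        if not match(pkt):
            continue
        if target in (ACCEPT, DROP):
            return target            # terminates processing for this packet
        if target == RETURN:
            return None              # explicit return to the calling chain
        verdict = run_chain(chains, target, pkt)  # call a user chain
        if verdict is not None:
            return verdict
        # The called chain returned without a verdict: carry on with the
        # next rule here, which is what makes this a call rather than a goto.
    return None                      # fell off the end of the chain
```

Calling run_chain(CHAINS, "INPUT", pkt) accepts SSH from the three listed hosts and drops everything else, because control comes back to INPUT whenever the user chain reaches no verdict.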

Xtables2 vs. nftables

Posted Jan 14, 2013 23:24 UTC (Mon) by intgr (subscriber, #39733)

Good point, I never thought of structuring my rules this way. It's better, but it requires you to artificially split things into separate chains and specify lots of things using negative logic, which is far from natural.

I just take the easy route and use FERM to translate between my brain and iptables.

Xtables2 vs. nftables

Posted Jan 15, 2013 5:41 UTC (Tue) by malor (guest, #2973)

> Right now each firewall rule has to stand on its own and you get no control over which order certain terms are evaluated.

Are we talking about the same thing? The language you're using in your examples isn't anything I recognize as being iptables-related, and I don't see anything being done with chains, which is kind of the point of the whole system. Are you confusing it with something else, maybe?

In iptables, there are five built-in chains in the network stack: PREROUTING, FORWARD, INPUT, OUTPUT, and POSTROUTING, plus any number of user chains inserted wherever one likes. Typically, the great majority of the work is done on the FORWARD and INPUT chains.

Because you can have any number of chains, it's fairly typical to 'divide and conquer': test whether it's a TCP packet, jump to a TCP chain, check for port matches there, and then make a decision. This full evaluation is not normally performed for every packet. Typically, packets that match the ESTABLISHED and RELATED connection states are accepted immediately, without any further processing, and that basically consists of a lookup in a connection table. So it's really fast for most of the packets in a session (usually all but the first couple). Novel packets, ones that either signify a new connection or are unwanted, usually navigate down a tree of tests, which means that any given packet won't usually need very many decisions. I imagine this condenses down into quite a small number of actual hardware instructions. Whatever the internals actually look like, it certainly seems efficient, as a Linux router/firewall is able to move a very large amount of traffic without needing dedicated hardware support.
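The fast path described above can be sketched as a connection-table lookup placed in front of the rule tree. The Python below is a deliberately simplified model: the real connection tracker also handles reply-direction packets, RELATED flows, TCP state, and timeouts, none of which appear here, and make_firewall and its counters are hypothetical names used only for this example.

```python
def make_firewall(allowed_tcp_ports):
    """Build a toy stateful filter with a connection table in front."""
    established = set()                      # known flows
    stats = {"fast_path": 0, "slow_path": 0}

    def filter_packet(pkt):
        key = (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"], pkt["proto"])

        # Fast path: packets of a known flow need a single table lookup.
        if key in established:
            stats["fast_path"] += 1
            return "ACCEPT"

        # Slow path: a novel packet walks the tree of tests.
        stats["slow_path"] += 1
        if pkt["proto"] == "tcp":
            verdict = "ACCEPT" if pkt["dport"] in allowed_tcp_ports else "DROP"
        else:
            verdict = "DROP"

        # Remember accepted flows so later packets take the fast path.
        if verdict == "ACCEPT":
            established.add(key)
        return verdict

    return filter_packet, stats
```

Sending the same accepted packet five times through this model takes the slow path once and the fast path four times, which is the "all but the first couple" effect in miniature.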

So, a virtual machine is cleaner, but probably less efficient. And I'm wondering if a general code cleanup on the existing system might not end up being better. I cheerfully concede that it's ugly as hell, but it seems very, very fast. That's an absolutely critical feature in firewalls, perhaps the crucial feature, after being able to do basic stateful inspection.

The nastiness of having to pass off to user processes for advanced inspection is something that only people who want that functionality have to deal with, whereas putting a VM in there may slow everyone down, making all of us pay for a few corner cases that most of us have no interest in.

All I really care about is speed, so if they can make the VM run as fast as regular iptables, then I have no other objection. It's not like my objection really matters anyway, I don't suppose, since I'm not writing the code, but still.

Xtables2 vs. nftables

Posted Feb 4, 2013 20:35 UTC (Mon) by jengelh (subscriber, #33263)

> Right now each firewall rule has to stand on its own and you get no control over which order certain terms are evaluated.

Rule evaluation order is very well defined: it follows the usual left-to-right evaluation with short-circuit semantics, like the && operator in C.

> It would be a lot more flexible to provide an abstract virtual machine in the kernel and let the user space generate whatever code it needs to support the protocol it wants. That's how bpf already works in the kernel,

iptables is already a VM of sorts. In addition, remember that xt_u32 has been in the kernel for a long time, and it looks like we will be gaining xt_bpf shortly as well.

But none of them is meant to deal with low-performing rules. If you test a condition multiple times, BPF should be doing it. If there is any static optimization such as common subexpression elimination to be done, then, I.M.H.O., userspace should be doing that before passing on the filter data to the kernel.
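One way user space could do that kind of static optimization is to factor out leading terms shared by consecutive rules before handing anything to the kernel. The Python sketch below is an assumption-laden illustration, not an existing tool: rules are modelled as (terms, verdict) pairs with first-match semantics, each term is a (field, value) equality test, and only consecutive rules with an identical first term are merged so that rule order is preserved.

```python
def factor(rules):
    """Merge consecutive rules that share their first term into one test."""
    tree = []
    i = 0
    while i < len(rules):
        terms, verdict = rules[i]
        if not terms:                        # unconditional rule
            tree.append(("verdict", verdict))
            i += 1
            continue
        head = terms[0]
        group = []
        while i < len(rules) and rules[i][0] and rules[i][0][0] == head:
            group.append((rules[i][0][1:], rules[i][1]))
            i += 1
        tree.append(("test", head, factor(group)))
    return tree

def evaluate_tree(tree, pkt, counter):
    """Walk the factored rules; counter[0] counts term tests performed."""
    for node in tree:
        if node[0] == "verdict":
            return node[1]
        _, (field, value), subtree = node
        counter[0] += 1
        if pkt.get(field) == value:
            verdict = evaluate_tree(subtree, pkt, counter)
            if verdict is not None:
                return verdict
    return None

def evaluate_flat(rules, pkt, counter):
    """Reference: first matching rule wins, terms short-circuit left to right."""
    for terms, verdict in rules:
        for field, value in terms:
            counter[0] += 1
            if pkt.get(field) != value:
                break
        else:
            return verdict
    return None

# The SSH whitelist from earlier in the thread, with terms in a fixed order.
SSH = (("proto", "tcp"), ("dport", 22))
RULES = [
    (SSH + (("src", "1.2.3.4"),), "ACCEPT"),
    (SSH + (("src", "2.2.2.2"),), "ACCEPT"),
    (SSH + (("src", "3.3.3.3"),), "ACCEPT"),
    (SSH, "DROP"),
]
```

For an SSH packet from an unlisted address, the flat rules in this model perform 11 term tests and the factored tree 5, with the same verdict. The factoring only helps when rules are written with shared terms first; reordering terms automatically would be a further step not shown here.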


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds