Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 18, 2010 4:27 UTC (Thu) by paulj (subscriber, #341)
In reply to: Ghosts of Unix past, part 3: Unfixable designs by neilbrown
Parent article: Ghosts of Unix past, part 3: Unfixable designs

That the rendezvous address is (potentially) at a higher level than the end-point address is normal layering. For any given layer that provides some kind of addressing semantics, there can always be another layer above it that implements richer addressing and must map its richer addresses down to the lower layer. That's good and normal.

So to look for conflation in networking addressing you probably need to stay within a layer. E.g. within IP, there is conflation in addressing because each address encodes both the identity of a node and its location in the network. Or perhaps more precisely: IP addressing lacks the notion of identity really, but an IP address is the closest you get and so many things use it for this. This may be fixed in the future with things like Shim6 or ILNP, which separate IP addressing into location and identity. This would allow upper-layer protocols like TCP to bind their state to a host identity, and so decouple them from network location.
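To make the separation concrete, here is a toy Python sketch. It is not how Shim6 or ILNP actually work on the wire, and the class, method, and peer names are invented for illustration. The point is only that transport state is keyed by a stable identity, while the locator is looked up separately and can change underneath it.

```python
class Host:
    """Toy model: transport state keyed by identity, not by locator."""

    def __init__(self):
        self.locators = {}     # identity -> current network locator
        self.connections = {}  # (identity, port) -> per-connection state

    def connect(self, identity, locator, port):
        self.locators[identity] = locator
        self.connections[(identity, port)] = {"bytes_sent": 0}

    def send(self, identity, port, payload):
        # State lookup uses the identity; the locator is only the "where".
        conn = self.connections[(identity, port)]
        conn["bytes_sent"] += len(payload)
        return self.locators[identity]

    def peer_moved(self, identity, new_locator):
        # Only the mapping changes; connection state is untouched.
        self.locators[identity] = new_locator


host = Host()
host.connect("peer-A", "2001:db8:1::10", 80)
host.send("peer-A", 80, b"hello")
host.peer_moved("peer-A", "2001:db8:2::10")
dest = host.send("peer-A", 80, b"again")
```

With today's IP the address plays both roles, so the equivalent of `peer_moved` invalidates the connection key itself.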

Variable length addresses would have been nice. The ISO packet protocol CLNP uses variable length NSAP addresses. However, hardware people tend to dislike having to deal with VL address fields. The tiny address space of IPv4 perhaps needn't have been unfixable - it could perhaps have been extended in a semi-compatible way. However it was decided (for better or worse) a long time ago to create IPv6.

Possibly another problem with IP, though I don't know where it fits in your list, is multicast. This is an error of foresight, due to the fact that multicast still had to be researched and it depended on first understanding unicast - i.e. IP first had to be deployed. The basic problem is that multicast is bolted on to the side of IP. It generally doesn't work, except in very limited scopes. One case is where it can free-ride on existing underlying network multicast primitives, i.e. ones provided by local link technologies. Another is where a network provider has gone to relatively great additional trouble to configure multicast to work within some limited domain - needless to say this is both very rare and even when done is usually limited to certain applications (i.e. not available generally to network users). In any new network scheme one hopes that multicast services would be better integrated into the design and be a first-class service alongside unicast.

Another retrospectively clear error is IP fragmentation. It was originally decided that fragmentation was best done on a host-by-host basis, on the assumption that path MTU discovery could be done through path network control signalling and that fragmentation/reassembly was a reasonably expensive process that middle-boxes ought not to be obliged to do. IMO this was a mistake: path MTU signalling turned out to be very fragile in modern deployment (IP designers didn't anticipate securo-idiocy), and fragmentation/reassembly turned out to be relatively cheap. Routers routinely use links, both for internal buses and external connections, which require fragmenting packets into small fixed-size cells. As a consequence of the IP fragmentation choices, the IP internet is effectively limited to an (outer) path MTU of 1500 forevermore, regardless of changes in hardware capability. This causes problems for any IP packet protocol which wants to encapsulate itself or another protocol. One imagines that any new network scheme would learn from the IP MTU mess, make different trade-offs and come up with something better and more robust.
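The encapsulation problem is simple arithmetic. A rough sketch, assuming the minimum header sizes (IPv4 without options, the basic 4-byte GRE header); the function name is made up for this example:

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
GRE_HEADER = 4    # bytes, basic GRE header without optional fields


def inner_mtu(outer_path_mtu, *encap_overheads):
    """Largest inner packet that fits without fragmenting the outer one."""
    return outer_path_mtu - sum(encap_overheads)


# IP-in-IP over a 1500-byte path
ipip = inner_mtu(1500, IPV4_HEADER)
# GRE over IPv4
gre = inner_mtu(1500, IPV4_HEADER, GRE_HEADER)
# A GRE tunnel carried inside another GRE tunnel
nested = inner_mtu(1500, IPV4_HEADER, GRE_HEADER, IPV4_HEADER, GRE_HEADER)
```

Every layer of encapsulation eats into the same fixed 1500 bytes, and a host that still sends 1500-byte packets into the tunnel depends on exactly the path MTU signalling that tends to get filtered.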

We should of course be careful to not overly condemn errors of foresight. Anticipating the future can be hard, particularly where people are busy designing cutting-edge new technology that will define the future. ;)

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 18, 2010 8:13 UTC (Thu) by Cato (subscriber, #7643) [Link]

One performance improvement of IPv6 is that it has a much more regular IP header structure, which involves a lower cost for hardware-based forwarding. has a good summary of the benefits of IPv6 including this one.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 1:15 UTC (Fri) by dlang (subscriber, #313) [Link]

That sounds like something that was a really big deal when IPv6 was created but, with the increased processor speeds we have now, is not nearly as important.

This isn't just that clock speeds are higher, but that the ratio of clock speed to system bus speed is no longer 1:1. This means that it's possible to execute far more steps without slowing the traffic down.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 11:15 UTC (Fri) by job (guest, #670) [Link]

If you're switching IP in hardware, it makes your design simpler and faster.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 11:41 UTC (Fri) by Cato (subscriber, #7643) [Link]

IP routers have not done CPU-based forwarding as the main path for a long time - the largest one is probably the Cisco CRS-3, which forwards 322 terabits per second when fully scaled, but even quite low-end routers now also use hardware forwarding (i.e. ASICs, network processors, etc., not CPU).

You can probably manage to forward anything in hardware, but it helps somewhat that IPv6 has a regular header design.

IPv6 and hardware-parseable IP headers

Posted Nov 19, 2010 23:26 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

I don't think CPU speed per se (how fast a single CPU is) is relevant. It's all about cost, since most IP networks are free to balance the number of CPUs, system buses, network links, etc.

And from what I've seen, as the cost of routing in a general purpose CPU has come down, so has the cost of doing it in a specialized network link processor (what we're calling "hardware" here) -- assuming the IP header structure is simple enough. So today, as ten years ago, people would rather do routing in an ASIC than allocate x86 capacity to it.

I think system designers balance system bus and CPU speed too, so it's not the case that there are lots of idle cycles in the CPU because the system bus can't keep up with it.

Ghosts of Unix past, part 3: Unfixable designs

Posted Dec 3, 2010 9:05 UTC (Fri) by paulj (subscriber, #341) [Link]

FWIW, major router vendors still use software routing for many of their lower-end enterprise routers, even some mid-range.

Ghosts of Unix past, part 3: Unfixable designs

Posted Nov 19, 2010 11:18 UTC (Fri) by job (guest, #670) [Link]

Path MTU discovery is indeed broken in practice.

One thing I never really understood is why TCP MSS is a different setting from MTU. Given the belief that the MTU could be auto detected, MSS could be deduced from it.
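The deduction I have in mind is just a subtraction of the fixed header sizes. A sketch, assuming no IP or TCP options (the function name is only for illustration):

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header
TCP_HEADER = 20   # bytes, TCP header without options


def mss_from_mtu(mtu, ip_header=IPV4_HEADER, tcp_header=TCP_HEADER):
    """Largest TCP payload that fits in one unfragmented packet."""
    return mtu - ip_header - tcp_header


mss_v4 = mss_from_mtu(1500)
mss_v6 = mss_from_mtu(1500, ip_header=IPV6_HEADER)
```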

Perhaps someone can enlighten me?

Copyright © 2018, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds