5.3 Merge window, part 1

Posted Jul 13, 2019 10:20 UTC (Sat) by sytoka (guest, #38525)
In reply to: 5.3 Merge window, part 1 by jem
Parent article: 5.3 Merge window, part 1

If IPv6 had been retro-compatible with IPv4, no problem of this type would have happened ...

5.3 Merge window, part 1

Posted Jul 13, 2019 12:02 UTC (Sat) by ju3Ceemi (subscriber, #102464)

Is there a design for an "IPv6-like" protocol that stays retro-compatible with IPv4?

From my point of view, there is nothing you can do to let IPv4 devices talk to a wider address space: the IP address is nothing but a unique identifier, which de facto makes your wish unrealizable.

5.3 Merge window, part 1

Posted Jul 13, 2019 13:51 UTC (Sat) by farnz (subscriber, #17727)

I hear that a lot from armchair designers, but I've yet to have one enumerate an actual way for IPv6 to be retro-compatible with IPv4 that isn't present in IPv6 already.

IPv4 addresses are a subset of IPv6 addresses - if you have an IPv4 address, then it also exists in (at least) five different forms for different (mostly failed) compatibility mechanisms in IPv6:

As an "IPv4-Compatible IPv6 Address" in ::/96 - e.g. 192.0.2.1 is also ::c000:201. The transition mechanisms that wanted to use this form of addressing (e.g. the original form of SIIT) never reached deployable state, and thus this form is now deprecated.
As an "IPv4-Mapped IPv6 Address" in ::ffff:0:0/96 - e.g. 192.0.2.1 is also ::ffff:c000:201. This is still in use, so that applications can pretend that IPv4 does not exist, and use these to refer to IPv4 addresses (when talking to the network stack, and in some protocols).
In 64:ff9b::/96 for NAT64 and 464XLAT purposes - e.g. 192.0.2.1 is also 64:ff9b::c000:201. This allows you to communicate over an IPv6-only network with a central IPv4 host that does NAT for you, getting you the CGNAT experience when talking to IPv4 hosts; this is deployed in (at least) T-Mobile USA and EE UK mobile networks.
Lots of times in 64:ff9b:1::/48 (depending on provider) to allow you to have multiple NAT64 or 464XLAT deployments addressed differently.
As a 6to4 prefix in 2002::/16, for example 192.0.2.1 gets you control of all of 2002:c000:201::/48. This lets you route IPv6 as an overlay on your existing IPv4 network, but is not popular because properly deployed IPv4 will always be at least as fast as IPv6 carried over an IPv4 network.
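
To make the embeddings above concrete, here is a minimal sketch using Python's standard ipaddress module. The helper names embed and sixtofour_prefix are invented for this illustration and are not part of any library; the only facts relied on are the prefixes listed above and that 192.0.2.1 is the four bytes c0 00 02 01.

```python
import ipaddress


def embed(prefix: str, v4: str) -> ipaddress.IPv6Address:
    """Put the 32 bits of an IPv4 address into the low 32 bits of a /96 prefix.

    Hypothetical helper for illustration only.
    """
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 96:
        raise ValueError("expected a /96 prefix")
    v4_int = int(ipaddress.IPv4Address(v4))
    return ipaddress.IPv6Address(int(net.network_address) | v4_int)


def sixtofour_prefix(v4: str) -> ipaddress.IPv6Network:
    """Build the 2002:V4ADDR::/48 prefix that 6to4 derives from an IPv4 address.

    Hypothetical helper for illustration only.
    """
    v4_int = int(ipaddress.IPv4Address(v4))
    # 16 bits of 0x2002, then the 32 IPv4 bits, then 80 zero bits.
    base = (0x2002 << 112) | (v4_int << 80)
    return ipaddress.IPv6Network((base, 48))


v4 = "192.0.2.1"
compatible = embed("::/96", v4)        # deprecated IPv4-Compatible form
mapped = embed("::ffff:0:0/96", v4)    # IPv4-Mapped form
nat64 = embed("64:ff9b::/96", v4)      # NAT64 / 464XLAT well-known prefix
six_to_four = sixtofour_prefix(v4)     # 6to4 /48 controlled by this IPv4 address

# The standard library can recover the IPv4 address from two of the forms.
print(mapped.ipv4_mapped)
print(ipaddress.IPv6Address(f"{six_to_four.network_address}1").sixtofour)
```

Comparing the results as address objects rather than as strings is deliberate: how an IPv4-mapped address is printed can differ between Python versions, while the underlying 128 bits do not.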

On top of that, SIIT got reworked into SIIT-DC, which allows a pure IPv6 network to choose a prefix for IPv4 use and do stateless NAT instead of NAT64. And, of course, you can do NAT64 or 464XLAT in a private prefix.

Each of these mechanisms has its own set of problems as compared to running dual stack networks; 464XLAT and NAT64 are only happening now because in the mobile world, operators are beginning to experience pain from running NAT'd IPv4, and reducing the need for NAT via IPv6 saves actual dollars. And, if nothing else, pure IPv4 has the advantage of zero change needed - any alternative IPv6 proposal needs to cope with IPv4-only hosts that refuse any form of change to support IPv6, otherwise you face the same problems as SIIT, NAT64, and 6to4 do.

5.3 Merge window, part 1

Posted Jul 13, 2019 16:11 UTC (Sat) by plugwash (subscriber, #29694)

6to4 has two big problems.

1. Routers don't relay by default; the relaying is handled by specific relay routers, which most ISPs don't implement.
2. It doesn't work with NAT.

A mechanism similar to 6to4, but designed to work with NAT (not to fight against it like Teredo does) and implemented by default as part of every dual stack router and OS, would IMO have eased deployment considerably.

5.3 Merge window, part 1

Posted Jul 13, 2019 16:25 UTC (Sat) by farnz (subscriber, #17727)

The lack of relay routers will exist for any overlay network - and note that if you have IPv4 on both ends, you can route to 2002::/16 over IPv4 directly, not needing a relay router. There's no particular reason why all dual-stack routers weren't relay routers, except that operators did not want to support any IPv6 transition mechanism at all - not even 6to4.

And Teredo is as good as you can get given the way NAT works; fighting NAT is the norm when you're trying to tunnel through it.

As I said, it's easy to set the goals for a better mechanism; it's a lot harder to actually design it.

5.3 Merge window, part 1

Posted Jul 14, 2019 2:55 UTC (Sun) by plugwash (subscriber, #29694)

> The lack of relay routers will exist for any overlay network

Yes it will for any overlay network invented *now*.

If such encapsulation had been part of the core protocol from the start and relaying behaviour had been mandated or at least strongly recommended as part of every dual stack router then we wouldn't have the mess of relay shortages we have today.

> fighting NAT is the norm when you're trying to tunnel through it.

The alternative to fighting the NAT is to go with the flow of it; that is, when decapsulating, modify the IPv6 addresses to match the IPv4 addresses/ports.

Of course such an idea is unthinkable to the "end to end is sacred" crowd.

5.3 Merge window, part 1

Posted Jul 14, 2019 3:18 UTC (Sun) by gus3 (guest, #61103)

The "end to end is sacred" crowd put far too much trust in the network at large. Replace one switch with one hub, and the whole "sacred" thing gets scrambled into "scared" instead.

That crowd should shift their thinking, to "just make sure the damn thing keeps working the way it did."

5.3 Merge window, part 1

Posted Jul 14, 2019 10:17 UTC (Sun) by farnz (subscriber, #17727)

I'm not talking about overlay networks invented now - I'm talking about overlay networks invented back in 1999, when IPv6 was new. And 6to4 encapsulation was strongly recommended as part of every dual stack router back then; the only error, IMO, in 6to4, was the effort expended to try and get people to run public relays, when we would have been better off with 2002::/16 containing the entire IPv4 routing table over time, and with everyone routing their own subset of 2002::/16, on the basis that when native IPv6 arrives, we'd be able to turn it off.

Further, note that there's no-one able to mandate relay operation - operators want to control what they offer, and won't offer free services just for the sake of it, especially not expensive ones like relays.

Going with the flow of NAT is what NAT64 (called NAPT in early IPv6 documents) does. It's not been deployed until recently because it provides zero gain unless you're planning to turn off IPv4 for large swathes of your network.

It's perhaps worth remembering that the whole reason Teredo exists is that Microsoft wanted an easy way for game developers to write multiplayer games that traverse NAT; Teredo was their answer, in that game developers just write IPv6-only games ignoring the existence of NAT, and Microsoft handles the NAT traversal problem for them in Teredo.

Again, though, I don't see constructive answers on how IPv6 could have been technically better at retrocompatibility - just claims that it can't be fixed (despite the fact that the very fix you're suggesting was in IPv6 in 1999, and ignored by network operators), and an insult to the people working on this stuff. What protocol changes would you have made that make IPv6 more compatible with IPv4?

5.3 Merge window, part 1

Posted Jul 14, 2019 10:37 UTC (Sun) by ianmcc (subscriber, #88379)

Make the IP address variable length; IPv4 would just be the special case where the address is 4 bytes.

5.3 Merge window, part 1

Posted Jul 14, 2019 11:02 UTC (Sun) by farnz (subscriber, #17727)

That has two problems, one inherent to variable length addressing, and one an upgrade problem, and also ignores the fact that IPv4 is already a special case of IPv6.

1. Until the last IPv4-only host has upgraded to support variable length addressing, all hosts need to have a 4 byte address. Like with IPv6 deployment, this is a chicken-and-egg problem; why would I choose to have a 5 byte or longer address when I could stick to 4 byte addresses and not have any interop issues?
2. Hardware-assisted routing has to handle addresses in fixed size chunks, and the routing delay is proportional to the number of chunks the hardware handles (i.e. the smaller the chunk, the bigger the penalty for a long address). With a variable length address, as CLNP proved with its NSAPs, there is an incentive to keep to short addresses, because the unfortunate souls with longer addresses have slower networking. In turn, this doesn't resolve the address shortage for the long term - there's always incentive to have "IPv4" addresses.

In IPv6, all addresses have the same hardware latency, and IPv4 is a special case of IPv6 anyway (this fact is used by SIIT-DC with a per-DC IPv6 /96 matching all of IPv4, and was the idea behind SIIT).

5.3 Merge window, part 1

Posted Jul 15, 2019 11:45 UTC (Mon) by ianmcc (subscriber, #88379)

Problem 1 is common to IPv6 as well. Legacy devices can go behind a NAT router, which basically everything is already.

Done properly, I don't think variable length addresses need to have a performance penalty, and indeed they might end up faster. E.g., if my ISP is allocated the address A.B.C, then they allocate their customers addresses of the form A.B.C.D. The routing tables only need to refer to A.B.C, and the ISP's routers only need to look up on D. My home network can use addresses of the form A.B.C.D.E (or additional devices hanging off something can get A.B.C.D.E.F et cetera). This doesn't change the lookup time for the upstream routers because they just ignore the parts of the address that are not relevant for them.

5.3 Merge window, part 1

Posted Jul 15, 2019 13:30 UTC (Mon) by farnz (subscriber, #17727)

While problem 1 is common to IPv6 during the transition period, the problem with variable length addresses that keep 1:1 IPv4 compatibility when source and destination are both 4 bytes is that the transition period is effectively infinite - there is never a penalty for refusing to migrate, whereas in IPv6 land, there is a penalty for failure to migrate once a tipping point is reached. For example, today it is the case that if you care about the performance of your servers when accessed via a mobile phone, you need to support IPv6, because for significant subsets of mobile users, IPv4 goes via a remote NAT, while IPv6 takes the shortest route.

And variable length addresses always have a performance or cost penalty in hardware, which never goes away. For a fixed size address, the router simply reads the address and acts on it. For a variable length address, the router has to read the length, read the first chunk of the address, mask off any parts of the first chunk of address that aren't valid, attempt to act on it, and then if the needed part of the address is longer than the chunk, repeat for the next chunk. Worse, if you're not cautious, router manufacturers will attempt to "get away" with not handling the full complexity - e.g. only route on the first N bits, and ignore the rest of the address - and if those routers become common, you've effectively shrunk the routable component of the address. We've seen this in IPv4 in the 80s, where routers fell back to a slow path if the routing prefix was too long (more than 16 bits), and we've seen this in IPv6 routers that only route on the first 64 bits of the address. Variable length addressing just makes this harder, because you also have to handle the pain that 32 bit "1.1.1.1" is not guaranteed to route to the same place as 64 bit "1.1.1.1/32", which is not guaranteed to route to the same place as 128 bit "1.1.1.1/32" (well, unless you remove the requirement that 32 bit "1.1.1.1" routes to the same place as IPv4 "1.1.1.1").

This extra complexity is inherent to variable length addressing, and makes the hardware more complex; in turn, this means that you either need more complex hardware to handle lookups in the same number of clock cycles, or you need more clock cycles to do the same lookup. Fixed length addresses avoid this - you always read a fixed size chunk and then act on it.

5.3 Merge window, part 1

Posted Jul 15, 2019 13:46 UTC (Mon) by excors (subscriber, #95769)

> Eg, if my ISP is allocated the address A.B.C, then they allocate their customers addresses of the form A.B.C.D.

I think the problem is that in practice, strict hierarchical addressing doesn't work. E.g. there's anycast, where the same IP address is advertised by multiple servers around the world, and users will get routed to whichever one is nearest (based on BGP's definition of "nearest"). Or for redundancy you might want one server to advertise a single IP prefix through two ISPs, so if one fails it'll get routed through the other.

Non-hierarchical usage of the IPv4 space has been a known issue for many years, causing significant expansion of routing tables (see e.g. https://bgp.potaroo.net/). That's quite a problem when routers store the table in expensive content-addressable memory (for efficient lookups), and the table size grows too large for the hardware.

There's a more fundamental issue with IP addresses being both "locator" and "identifier". Originally they were seen as locators, i.e. a hierarchical address that describes how to find the server with increasing specificity, with routing based on IP prefixes and CIDR etc. DNS mapped stable identifiers (domain names) onto addresses. DNS didn't work well enough for that, so nowadays IP addresses are often just identifiers and don't indicate anything about the actual location of the server (as with anycast and multihoming), but routing protocols weren't designed to be efficient identifier lookup services. Occasionally people have tried to disentangle the two concepts, like with LISP, but I don't know if they've had any success.

5.3 Merge window, part 1

Posted Jul 15, 2019 14:47 UTC (Mon) by imMute (guest, #96323)

That's how route aggregation works today. Route lookups are already fast using hardware TCAM. Variable length addresses would make the TCAM implementation harder. Or, more likely, they'd just make the TCAM addresses the max size allowed by the variable length spec. And you'd end up with smaller tables that wasted space.
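
As a rough software illustration of aggregation and longest-prefix matching (not of how a TCAM is built; a TCAM compares all entries in parallel in hardware, while this sketch scans a dictionary), something like the following works with Python's standard ipaddress module. The prefixes come from documentation ranges, and the next-hop names and the longest_prefix_match helper are made up for the example.

```python
import ipaddress

# Four adjacent customer /26s that an ISP could announce as a single /24.
customer_nets = [
    ipaddress.ip_network("198.51.100.0/26"),
    ipaddress.ip_network("198.51.100.64/26"),
    ipaddress.ip_network("198.51.100.128/26"),
    ipaddress.ip_network("198.51.100.192/26"),
]
aggregated = list(ipaddress.collapse_addresses(customer_nets))


def longest_prefix_match(table, destination):
    """Return the next hop of the most specific prefix containing destination.

    Hypothetical helper: a linear scan, purely for illustration.
    """
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # The entry with the longest prefix length is the most specific route.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


# Upstream routers only need the aggregate; the ISP also holds a more
# specific route for one customer.
table = {
    ipaddress.ip_network("198.51.100.0/24"): "isp-core",
    ipaddress.ip_network("198.51.100.64/26"): "customer-b",
}

print(aggregated)
print(longest_prefix_match(table, "198.51.100.70"))
print(longest_prefix_match(table, "198.51.100.200"))
```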

5.3 Merge window, part 1

Posted Jul 15, 2019 15:10 UTC (Mon) by farnz (subscriber, #17727)

Note, too, that a variable length address space limited to N bits of address can be mapped into a fixed size address space of size N+1 bits. You add a prefix bit which is 1 if the next N bits are the full address, or 0 otherwise, and do this recursively. You can then unmap by counting leading 0s to retrieve the address size, strip the next 1 bit, and the remainder is the address.

In other words, unless your variable length address is greater than 127 bits in maximum size, it can be entirely mapped into IPv6.
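
A minimal sketch of that mapping, assuming addresses are handled as Python integers and N = 127 so the result fits in 128 bits. The function names encode and decode are invented for this example. Applying the prefix bit recursively is the same as writing N - L zero bits, a single 1 bit, and then the L address bits.

```python
N = 127  # maximum variable-length address size in bits (assumption for the sketch)


def encode(addr: int, length: int, n: int = N) -> int:
    """Map a length-bit address into an (n + 1)-bit fixed-size value.

    Layout: (n - length) zero bits, one marker bit set to 1, then the address.
    """
    if not 0 <= length <= n:
        raise ValueError("address length out of range")
    if addr < 0 or addr >> length:
        raise ValueError("address does not fit in the stated length")
    return (1 << length) | addr


def decode(word: int, n: int = N) -> tuple:
    """Recover (addr, length) by locating the marker bit below the leading zeros."""
    if word <= 0 or word >> (n + 1):
        raise ValueError("not a valid encoded address")
    length = word.bit_length() - 1          # position of the marker bit
    return word & ((1 << length) - 1), length


# A 32-bit "1.1.1.1" and a 64-bit address whose low bits are the same value
# stay distinct after mapping, and both round-trip.
short = encode(0x01010101, 32)
longer = encode(0x01010101, 64)
print(decode(short), decode(longer), short != longer)
```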

5.3 Merge window, part 1

Posted Jul 16, 2019 23:56 UTC (Tue) by mtaht (subscriber, #11087)

I've kind of wondered how much of the internet, particularly the IPv6 portion, is actually routed by TCAM based hardware. Software routing in SDR and Linux/BSD based implementations seems to be on the rise.