Fundamentally modifications to TCP are necessary

Posted Mar 30, 2013 10:26 UTC (Sat) by paulj (subscriber, #341)
In reply to: Fundamentally modifications to TCP are necessary by giraffedata
Parent article: Multipath TCP: an overview

At the IP level, nodes are identified by IP addresses and the IP network routes between addresses. This may mean that different packets take different routes, even though they travel from the same source to the same destination IP address. TCP, indeed, generally knows nothing about this.

However, some hosts have multiple network connections. E.g., a common case today is having both a 3G and a wifi connection, each to a completely different network.

There is no way in IP to treat multiple IP addresses as one logical IP, and have IP handle the routing between them. Well, there are some ways (Mobile IP). Unfortunately these solutions rely on tunnelling packets, and the lack of foresight of IP designers and network operators in crippling the Internet to a fixed MTU means tunnelling is not generally practical with the Internet today. (And, sadly, IPv6 makes things worse in this regard :( ).

So, if you want to make use of multiple, disparate, network attachments, you need to do this above IP. Either you must modify TCP, or you must insert another protocol between IP and TCP (e.g. a shim, see shim6). Unfortunately, again down to lack of foresight and (most of all) stupidity of network operators, new IP protocols are not generally practical with the Internet today - too many idiotic middle-boxes will filter them out. So at the TCP level it has to be.
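To make the middle-box point concrete, here is a minimal sketch of a filter that forwards only a short allow-list of IP protocol numbers. The allow-list and the `middlebox_forwards` helper are assumptions made up for illustration, not any real device's policy. The two constants are the IANA-assigned values: protocol number 140 for Shim6 and TCP option kind 30 for Multipath TCP.

```python
import socket

SHIM6_PROTO = 140        # IANA protocol number assigned to Shim6
MPTCP_OPTION_KIND = 30   # IANA TCP option kind assigned to Multipath TCP

# Hypothetical middle-box policy: forward only the "usual" protocols.
ALLOWED_PROTOCOLS = {socket.IPPROTO_ICMP, socket.IPPROTO_TCP, socket.IPPROTO_UDP}


def middlebox_forwards(ip_protocol):
    """Return True if this (made-up) middle-box would forward the packet."""
    return ip_protocol in ALLOWED_PROTOCOLS


# A shim between IP and TCP carries its own protocol number and is dropped.
shim_packet = {"protocol": SHIM6_PROTO}

# Multipath TCP still says "TCP" in the IP header; the extension lives in
# a TCP option, so a filter that only looks at the protocol number passes it.
mptcp_packet = {"protocol": socket.IPPROTO_TCP, "tcp_options": [MPTCP_OPTION_KIND]}

shim_forwarded = middlebox_forwards(shim_packet["protocol"])
mptcp_forwarded = middlebox_forwards(mptcp_packet["protocol"])
```

Under this toy policy `shim_forwarded` is false and `mptcp_forwarded` is true, which is the whole argument for doing the work inside TCP.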

(There's a recurring theme here. If IPv6 fails - and it might - the next generation IP protocol will have to have a fixed preamble that looks exactly like the beginning of an IPv4 + TCP + HTTP packet. Sigh).

Fundamentally modifications to TCP are necessary

Posted Mar 30, 2013 20:25 UTC (Sat) by giraffedata (guest, #1954)

> There is no way in IP to treat multiple IP addresses as one logical IP [address]

The MPTCP alternative I'm thinking about wouldn't do that. The client would use one IP address and the overall IP network would route via either the 3G or wifi network based on its own policies.

For example, the handset has IP address 1.2.3.4 on the wifi network and 4.3.2.1 on the 3G network. All outgoing packets from the handset bear source IP address 1.2.3.4. All packets the server sends to the handset bear destination IP address 1.2.3.4. The server's routing table (or the routing table in its default router) shows that in addition to the normal routes, 4.3.2.1 is a route to 1.2.3.4.

Does that work?

Fundamentally modifications to TCP are necessary

Posted Mar 30, 2013 20:39 UTC (Sat) by dlang (guest, #313)

The problem is that when the other side needs to reply to you, it sends packets out with a destination of your 1.2.3.4 IP address.

Every router along the way will have one best path to get to 1.2.3.4, and so all the traffic will go down that path.

It's this best path priority routing that lets the Internet work as well as it does, but it means that if you have two completely different connections, and you want to use them both, you have to split the traffic between two different IP addresses, one for each connection, so that the traffic to you will get split between those connections.
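A small sketch of why this happens, assuming a single router with a made-up routing table (the prefixes, next-hop names and metrics are all hypothetical). Lookup is a longest-prefix match with the lowest metric breaking ties, so a given destination address always maps to exactly one next hop.

```python
import ipaddress

# Hypothetical routing table for one router: (prefix, next hop, metric).
ROUTES = [
    (ipaddress.ip_network("1.2.3.0/24"), "wifi-isp", 10),
    (ipaddress.ip_network("1.2.0.0/16"), "transit-a", 20),
    (ipaddress.ip_network("4.3.2.0/24"), "3g-carrier", 10),
    (ipaddress.ip_network("0.0.0.0/0"), "default-upstream", 100),
]


def best_next_hop(dst):
    """Pick the single best route: longest prefix first, then lowest metric."""
    addr = ipaddress.ip_address(dst)
    candidates = [route for route in ROUTES if addr in route[0]]
    _net, hop, _metric = max(candidates, key=lambda r: (r[0].prefixlen, -r[2]))
    return hop


# Every packet addressed to 1.2.3.4 leaves through the same next hop,
# no matter how many packets are sent.
hops_single = {best_next_hop("1.2.3.4") for _ in range(1000)}

# Only by addressing traffic to both of the handset's addresses do both
# attachments get used.
hops_split = {best_next_hop(a) for a in ("1.2.3.4", "4.3.2.1")}
```

Here `hops_single` contains only `wifi-isp`, while `hops_split` contains both `wifi-isp` and `3g-carrier`.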

The nice thing with multipath TCP is that it can do this under the covers (in a library or in the kernel).

With multipath TCP widely available, you could even have this implemented in a router. The router would have multiple connections, and proxy the TCP connection into multipath TCP, using all of its connections in a way that is completely transparent to the endpoint machine.

Fundamentally modifications to TCP are necessary

Posted Mar 30, 2013 20:42 UTC (Sat) by dlang (guest, #313)

Clarifying one point that I skipped:

Routing on the Internet is done one hop at a time.

Your default routers don't talk directly to the server; they talk to other routers that talk to other routers ... that talk to the server.

It's common to have 10-20 routers in the path for some connections (do a traceroute to the server to see the routers that your traffic to the server goes through, and keep in mind that the traffic from the server back to you may go through a different series of routers).
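Hop-by-hop forwarding can be sketched as follows. The topology and router names are invented for illustration. Each router holds only its own next hop for a destination, never the whole path, and the return path is deliberately set up to differ from the outbound one.

```python
# Hypothetical per-router forwarding tables: router -> {destination: next hop}.
NEXT_HOP = {
    "home-router": {"server": "isp-edge"},
    "isp-edge": {"server": "transit-a", "home-router": "home-router"},
    "transit-a": {"server": "dc-edge"},
    "transit-b": {"home-router": "isp-edge"},
    "dc-edge": {"server": "server", "home-router": "transit-b"},
    "server": {"home-router": "dc-edge"},
}


def trace(src, dst, max_hops=30):
    """Follow each node's own next-hop choice until dst is reached."""
    path = [src]
    node = src
    while node != dst:
        if len(path) > max_hops:
            raise RuntimeError("hop limit exceeded")
        node = NEXT_HOP[node][dst]  # a purely local decision at this hop
        path.append(node)
    return path


outbound = trace("home-router", "server")
inbound = trace("server", "home-router")
```

In this toy network the outbound path crosses `transit-a` and the return path crosses `transit-b`. No single node, including the server, ever chooses the whole route.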

Fundamentally modifications to TCP are necessary

Posted Mar 31, 2013 1:01 UTC (Sun) by giraffedata (guest, #1954)

Thanks; that explains it. I forgot that every router along the way would be independently choosing a route to the destination.

Fundamentally modifications to TCP are necessary

Posted Mar 30, 2013 22:30 UTC (Sat) by paulj (subscriber, #341)

So does the intermediary node with the "4.3.2.1 is also a destination for 1.2.3.4" routing entry also send copies of the packet to 4.3.2.1? Or just to one of them? If copies, what happens if the copies also go through intermediaries that create copies?

Is this more efficient than just letting the end-stations create additional flows between any additional IP addresses? If yes, how?

Fundamentally modifications to TCP are necessary

Posted Apr 1, 2013 22:41 UTC (Mon) by marcH (subscriber, #57642)

> So, if you want to make use of multiple, disparate, network attachments, you need to do this above IP. Either you must modify TCP, or you must insert another protocol between IP and TCP (e.g. a shim, see shim6).

The original sin is actually quite simple: it's the lack of layering in TCP/IP. TCP should not "steal" and rely on IP addresses directly. Just like every other layer, the TCP layer should have its own "host address" and some indirection logic to resolve it to IP layer address(es). This indirection logic would be the most natural place to implement mobility and all the other features Multipath TCP is offering. I'm sure there are a few good research papers explaining this.

Now of course it's way too late for such a dramatic change, but I find that keeping in mind the core shortcoming and the very theoretical but "correct" design helps to understand all the numerous and more complex workarounds that keep being offered.
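A rough sketch of that indirection, with every name here (`LocatorMap`, `Connection`, the `"handset"` identifier) invented for illustration. It is not how TCP or any existing protocol works. The connection is keyed on a stable host identifier and resolves it to whatever IP addresses are currently attached.

```python
class LocatorMap:
    """Hypothetical table: stable host identifier -> current IP addresses."""

    def __init__(self):
        self._addrs = {}

    def attach(self, host_id, ip):
        addrs = self._addrs.setdefault(host_id, [])
        if ip not in addrs:
            addrs.append(ip)

    def detach(self, host_id, ip):
        addrs = self._addrs.get(host_id, [])
        if ip in addrs:
            addrs.remove(ip)

    def resolve(self, host_id):
        # Return a copy so callers cannot mutate the table.
        return list(self._addrs.get(host_id, []))


class Connection:
    """Transport connection identified by a host identifier, not an IP."""

    def __init__(self, remote_id, locators):
        self.remote_id = remote_id
        self.locators = locators

    def usable_destinations(self):
        return self.locators.resolve(self.remote_id)


locators = LocatorMap()
locators.attach("handset", "1.2.3.4")   # wifi address
locators.attach("handset", "4.3.2.1")   # 3G address
conn = Connection("handset", locators)

both = conn.usable_destinations()
locators.detach("handset", "1.2.3.4")   # wifi goes away
after_wifi_loss = conn.usable_destinations()
```

The same `conn` object survives the loss of the wifi address, because nothing in its identity depends on an IP address. Keeping such a mapping current and trustworthy is the hard part, and this sketch ignores it entirely.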

Fundamentally modifications to TCP are necessary

Posted Apr 1, 2013 23:14 UTC (Mon) by dlang (guest, #313)

In theory you could propagate routes for every host to the entire Internet and all hosts could move dynamically.

In practice it just doesn't scale: routers can't handle routing tables that large at acceptable speeds.

Someday this may change.

It has changed for phones. It used to be that the area code and first three digits of the number routed you to specific buildings and then the last digits routed you out from that. While there are still large chunks of landlines that mostly follow this model, phone number portability and cell phones mean that any phone number can appear anywhere on the network.

Now, the phone system only needs to look this up to set up the conversation, not for each packet. This is the "smart network vs dumb network" discussion from above.

Given that the "smart network" of the phone system now tends to run on top of the "dumb network" of the Internet, I think it's pretty clear that the Internet has shown itself to be far superior.

If you think about it, the Internet already has the layer of indirection you are talking about: DNS. The problem is that DNS lookups are far too slow, and DNS updates far too infrequent, for it to be used in routing decisions for every packet.

Fundamentally modifications to TCP are necessary

Posted Apr 2, 2013 2:44 UTC (Tue) by giraffedata (guest, #1954)

Layering doesn't require TCP to have endpoint addressing that doesn't involve IP addresses, because it isn't that kind of layer.

Now if TCP were a network topology layer, then it would need its own addressing and could easily do the kind of routing we're talking about. But I would not expect anyone to have designed TCP that way, because it would be redundant. The basic architecture of the Internet says routing packets to whatever ephemeral link happens to be up now is what the IP layer is for. A TCP driver is supposed to be blissfully ignorant of paths and concentrate on turning a blizzard of packets into an ordered, ungranular, reliable stream.
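As a rough illustration of that job, here is a sketch of turning out-of-order, duplicated segments into an ordered byte stream. The `reassemble` function is a toy written for this example. It assumes segments never partially overlap and ignores retransmission, windows and everything else real TCP does.

```python
def reassemble(segments, initial_seq=0):
    """Return (stream, next_seq) from (seq, payload) pairs in any order.

    Only the contiguous run starting at initial_seq is delivered; anything
    beyond a gap stays undelivered, as it would in a receive buffer.
    """
    pending = {}
    for seq, payload in segments:
        pending.setdefault(seq, payload)  # ignore duplicate segments

    stream = bytearray()
    next_seq = initial_seq
    while next_seq in pending:
        payload = pending.pop(next_seq)
        stream += payload
        next_seq += len(payload)
    return bytes(stream), next_seq


# Segments as they might arrive: reordered, one duplicated, one past a gap.
arrived = [(5, b"world"), (0, b"hello"), (5, b"world"), (15, b"late")]
stream, next_seq = reassemble(arrived)
```

The receiver delivers `b"helloworld"` and waits at sequence number 10. Nothing in the function knows or cares which path each segment took.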

The problem as I see it is just that IP hasn't evolved in a way that its routing protocols are sufficient for the needs of millions of handsets hopping from one wireless network to another. Considering that the original routing protocols were hardcoded files on each node, that's not surprising.

Fundamentally modifications to TCP are necessary

Posted Apr 2, 2013 7:25 UTC (Tue) by marcH (subscriber, #57642)

An IP address conflates two things that should be separate: location (where) and identity (who). The latter should be in layer 4.

You don't ask the entire Post office to update ZIP codes when you move house.

It's not me saying it but the whole research community. Look at M-TCP, HIP, GTP (GPRS tunnelling), dynamic load balancing... they all try to somehow retrofit this separation in a backward, half-compatible way. Because it's too late it tends not to be pretty.

"As simple as possible, but not simpler" - too simple this time.

Fundamentally modifications to TCP are necessary

Posted Apr 2, 2013 12:17 UTC (Tue) by paulj (subscriber, #341)

I don't think IP conflates these. It just wasn't an issue on the horizon in the design of IP. That said, the original designers of IP did envision that further addressing schemes (e.g. the "associative addressing" Cerf & Kahn referred to in their '74 paper) might be layered over TCP.

Sadly, the designers and implementors that followed chose to prioritise short-term performance concerns over the long-term flexibility of IP. It became effectively impossible to insert new protocols between IP and TCP (in the sense of it having an IP protocol number != TCP).

It might still be possible to insert an identity layer. The lower 64 bits of the IPv6 address could be used for this. Unfortunately though:

a) There's no guarantee IPv6 will succeed

b) Even if it does, there are (as usual) short-sighted people out there pushing to abolish the split in IPv6 addresses between network and host ID portions ("Why should we limit the hierarchical network space to 64 bits? Why do we need 64 bits for a host?").

So we shall see if this is possible. Otherwise, it has to be done in TCP.
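For what the 64/64 split looks like in practice, here is a small sketch using Python's `ipaddress` module. The helper names and the example addresses (from the 2001:db8::/32 documentation range) are made up for illustration. Treating the lower 64 bits as a stable identity is the idea under discussion, not something IPv6 itself guarantees.

```python
import ipaddress

LOW_64_MASK = (1 << 64) - 1


def split_ipv6(addr):
    """Split an IPv6 address into (upper 64 bits, lower 64 bits)."""
    value = int(ipaddress.IPv6Address(addr))
    return value >> 64, value & LOW_64_MASK


def join_ipv6(prefix, interface_id):
    """Rebuild an address from a 64-bit prefix and a 64-bit host part."""
    return ipaddress.IPv6Address((prefix << 64) | interface_id)


# The same host attached to two different networks: the upper 64 bits
# (location) change, the lower 64 bits (candidate identity) do not.
wifi_prefix, wifi_host = split_ipv6("2001:db8:1:2::abcd")
cell_prefix, cell_host = split_ipv6("2001:db8:ffff:9::abcd")

same_identity = wifi_host == cell_host
moved = join_ipv6(cell_prefix, wifi_host)
```

`same_identity` is true here and `moved` equals `2001:db8:ffff:9::abcd`. If the fixed 64-bit boundary were abolished, there would be no agreed place to cut the address like this.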

Identity is a very complex issue. It can mean different things to different people/processes at different times. Think about the identity for an email address, or an SSL cert, or a web page - you can surely think of many different scenarios and distinct issues for each. At the network layer, it is very hard to come up with a universal meaning of identity other than "the location in the network". Identity is an issue that really can't be solved at the network layer, other than equating it with location. Even a shim protocol between TCP and IP can't really say more than "these 2 network locations appear to be controlled by the same entity, around this time".

Anyway... :)

Fundamentally modifications to TCP are necessary

Posted Apr 2, 2013 16:15 UTC (Tue) by giraffedata (guest, #1954)

> An IP address conflates two things that should be separate: location (where) and identity (who). The latter should be in layer 4.

I'm with you on there being a need to separate location and identity, and to do it by layers, but it looks like all part of layer 3 (network layer) to me. One should be able to direct any IP packet to an identity, not just a TCP stream.

That's just speaking of ideals, of course. I'm not saying that's the direction we should be going now.

There is a layering issue between TCP and IP in that the TCP port address shouldn't be in the IP packet header. I wouldn't want to confuse that with this.