Is it possible to detect if the dodgy equipment is causing problems, and set a flag in the kernel to transmit the packets in the correct order? For example, does the dodgy equipment return some response that Linux can see, such as an ICMP reject on the connection?
Yes, this is a hack, and I for one hate hacks that permit bad behaviour in other devices at the expense of maintainability and simplicity of the non-offending code. But it may be a better option than turning all TCP timestamps off or reverting the kernel.
It might also provide a way to alert users that their networking hardware needs updating, which solves the problem in a more permanent way.
Posted Oct 28, 2008 23:43 UTC (Tue) by rfunk (subscriber, #4054)
How does that work better than just making the kernel *always* transmit in
the "right" order? Sounds like adding lots of complication for nearly
zero gain.
Networking change causes distribution headaches
Posted Oct 29, 2008 0:24 UTC (Wed) by dlang (✭ supporter ✭, #313)
the kernel always did transmit in the "right" order according to the RFCs
the problem is that it has been discovered that there are some routers out there that do not follow the RFCs and only work if things get transmitted in one specific order.
so the kernel has been changed (post 2.6.27) to transmit in the order that this batch of broken routers requires.
for bonus points, what should the kernel do if another batch of broken routers is discovered that wants a different order?
Networking change causes distribution headaches
Posted Oct 29, 2008 1:08 UTC (Wed) by jamesh (subscriber, #1159)
> for bonus points, what should the kernel do if another batch of
> broken routers is discovered that wants a different order?
Presumably, the current broken routers work with the packets generated by Windows. If a new router expected a different option order it wouldn't work with Windows, which is the kind of problem that would be noticed.
Networking change causes distribution headaches
Posted Oct 29, 2008 1:29 UTC (Wed) by dlang (✭ supporter ✭, #313)
so we need to reverse engineer how windows does things and never do anything different, even if the RFC allows it?
with that mindset we can never be better than windows.
yes, it is the case with dodgy hardware that sometimes we do end up saying that 'windows does it this way and it works, the hardware doesn't follow the specs so we just need to do it the same way'.
but to take that attitude about something that's supposed to be as generic as your network packets can be crippling.
Networking change causes distribution headaches
Posted Oct 29, 2008 2:23 UTC (Wed) by corbet (editor, #1)
The sad fact is that "what does Windows do?" is a question that kernel developers often have to keep in mind. Whatever Windows does is what's actually tested; it's often the only thing that works. It's a pain.
Networking change causes distribution headaches
Posted Nov 1, 2008 4:32 UTC (Sat) by jbailey (subscriber, #16890)
It's not so much a matter of never as of knowingly. Linux doing ECN
managed to make all sorts of devices on the Internet not cope with Linux,
so it had to be disabled in order to work. But it's still there and an
option. This thing isn't going to matter one way or the other, so it may
as well be done as Windows does it to avoid any hassle.
Tks,
Jeff Bailey
Networking change causes distribution headaches
Posted Oct 29, 2008 3:48 UTC (Wed) by gdt (subscriber, #6284)
In the real world kernels deal with equipment which incorrectly implements specifications all of the time: ranging from hard disks to TCP. TCP itself has one option (the urgent pointer) whose current interpretation differs from the original specification due to an implementation error in early BSD.
This issue is hardly the first home router or firewall issue encountered: some break on ECN, some break on SACK, some incorrectly handle large window scale values. Some of those home routers with bugs run Linux.
It is disappointing that Ubuntu chose to limit the performance of TCP rather than ship a patched kernel.
Networking change causes distribution headaches
Posted Oct 29, 2008 4:02 UTC (Wed) by dlang (✭ supporter ✭, #313)
are they limiting the performance of TCP?
I've seen many cases where doing the time calls in the TCP stack becomes the limiting factor, so disabling this should speed up TCP. it limits the features, but not the performance.
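For readers who want to check which way their own machine is set: on Linux the timestamp option is controlled by the `net.ipv4.tcp_timestamps` sysctl, which is also visible as a file under `/proc/sys`. The following is a minimal Python sketch, not a definitive tool. The function name `tcp_timestamps_enabled` is made up for this example, and it assumes that any nonzero value in the file means timestamps are on.

```python
from pathlib import Path

# Standard location of the sysctl on Linux; absent on other systems.
DEFAULT_PATH = "/proc/sys/net/ipv4/tcp_timestamps"


def tcp_timestamps_enabled(path=DEFAULT_PATH):
    """Return True/False for the sysctl value, or None if it cannot be read.

    Hypothetical helper for illustration: any nonzero value is treated
    as "enabled", which is an assumption of this sketch.
    """
    try:
        text = Path(path).read_text().strip()
    except OSError:
        # Not Linux, or /proc is not mounted or readable.
        return None
    return int(text) != 0


if __name__ == "__main__":
    print("TCP timestamps enabled:", tcp_timestamps_enabled())
```

Turning the option off is the administrator's side of the same knob, for example `sysctl -w net.ipv4.tcp_timestamps=0` run as root.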
Networking change causes distribution headaches
Posted Oct 29, 2008 14:52 UTC (Wed) by drag (subscriber, #31333)
Well, if your Ubuntu system is failing to, you know, contact the update server to download a fixed kernel because TCP is being blocked by a broken router, when every older version of Ubuntu can do it just fine, and every other OS does it just fine... then ya, that's a dramatic reduction in performance.
-----------------------------------------
I can't believe Ubuntu people are so closed-minded that they can't understand that if you can't get out on the internet to download a fixed kernel, then you're screwed. Your only option, as an end user, is to download the kernel fix post-installation. But if you can't contact the update server because your kernel is triggering a common TCP implementation bug.. then you're SOL.
There is a similar issue with DNS brokenness with Linux in general. As in: Linux behaving correctly, but getting bad results because an ISP can't get their shit straight or you have a buggy DNS proxy in some SOHO router. This is pretty common and it prevents end users from being able to reliably use some websites, which otherwise work perfectly well in any other OS. (the fix is usually to install a local DNS caching service like dnsmasq on the system)
Your bugs / my problem
Posted Oct 29, 2008 18:57 UTC (Wed) by tialaramex (subscriber, #21167)
This sort of brokenness is universal. Software has bugs. Sometimes the other guy's software has bugs, but you have to pay the price. So long as we don't have some evidence that the bugs were a result of malice, there is nothing much to do except name & shame, and then suck it up.
Prior examples include: DNS servers that silently ignore AAAA requests instead of replying that there's no matching record, causing a timeout for users who merely /enquired/ if they could use IPv6. IP "firewalls" that drop every type of ICMP packet indiscriminately by default. HTTP servers that silently accept pipelined requests but don't pipeline the answers, so they answer all your HTTP queries, but the results are arbitrarily muddled together. Home routers that silently modify any 4-byte sequence resembling your private IP address to the 4 bytes representing the masqueraded public address. Yes, those really exist. Sometimes it seems like it'd be better to flush it all away and start over, but don't make that mistake: we'd make just as many errors next time.
Although they seem to be the worst offenders, the proprietary systems aren't the only ones making these goofs. Samba's buggy attempt at early implementation of a new Windows SMB feature meant that not only could you not use the feature with Samba, but Microsoft had to disable it for Windows clients too, so everyone lost.
And let's not dwell on Debian's OpenSSL goof. To achieve a reasonable expectation of security everyone's SSL implementations should be updated to regard all the affected keys as weak, and reject them outright - but doing that means a permanent increase in the overhead of using SSL forever and for everyone in the whole world. Ouch.
Networking change causes __REGRESSION__
Posted Oct 29, 2008 1:29 UTC (Wed) by brianomahoney (subscriber, #6206)
I, for one, am getting _very_fed-up_ with people who don't seem to
understand that breaking something that was working is a very bad thing. I
completely agree with Linus that these REGRESSIONS are to be avoided, and
fixed ASAP.
At the least they are very irritating and usually time-consuming, and
there are all too many these days, in the kernel (e.g. this, e1000e) and
in userland (e.g. the latest Firefox/Seamonkey breaking non-CUPS printing).
While it is true that newbies should not be using alpha or beta stuff, it
is also true that fewer and fewer corner cases are being tested before
shipping the newest-latest ... oops!
Networking change causes __REGRESSION__
Posted Oct 29, 2008 4:02 UTC (Wed) by dlang (✭ supporter ✭, #313)
who are you upset at?
the kernel developers did fix it quickly after it was reported.
it's impossible to test against all hardware as there is nobody in the world that has one of everything to test against (especially when you consider that firmware updates can radically change the behavior as well)
Networking change causes distribution headaches
Posted Oct 29, 2008 2:51 UTC (Wed) by PaulWay (✭ supporter ✭, #45600)
Because the intention is not to wallpaper over the mistake and forget about it, the intention is to alert the user that they have non-compliant hardware on their network and they should upgrade.
Because Linux is not an operating system that says "well, it sort of kind of works, that's good enough, why change it?" to decisions like this. Reverting back to the previous behaviour is good to fix the problem short-term, but a long-term solution needs to be developed.
IMO patching device drivers and kernels to make them work with hardware in the machine is (vaguely) acceptable; the further the device is from the machine, the more it's not the kernel's responsibility.
Networking change causes distribution headaches
Posted Oct 29, 2008 4:31 UTC (Wed) by jamesh (subscriber, #1159)
In this particular case, there are multiple ways to structure a packet that are considered equally valid according to the RFC and have the same code complexity.
One of the options happens to avoid a bug in certain hardware, probably due to matching the behaviour of a certain competing operating system. Why on earth wouldn't you choose that option?
Your suggestion would result in more complex code that has the potential to be slower and more buggy.
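To make "equally valid" concrete, here is a minimal Python sketch that encodes the same set of TCP SYN options in two different orders. The option kind numbers (MSS 2, window scale 3, SACK-permitted 4, timestamps 8, NOP 1) are the standard ones. The two orderings, the sample values, and the helper names are assumptions picked for illustration, not the exact layouts of any kernel release or of Windows.

```python
import struct

NOP = b"\x01"  # kind 1: single-byte padding option


def mss(value):
    return struct.pack("!BBH", 2, 4, value)      # kind 2, length 4


def window_scale(shift):
    return struct.pack("!BBB", 3, 3, shift)      # kind 3, length 3


def sack_permitted():
    return struct.pack("!BB", 4, 2)              # kind 4, length 2


def timestamps(tsval, tsecr=0):
    return struct.pack("!BBII", 8, 10, tsval, tsecr)  # kind 8, length 10


def build_options(parts):
    """Concatenate options and pad with NOPs to a 32-bit boundary."""
    raw = b"".join(parts)
    return raw + NOP * (-len(raw) % 4)


def parse_options(raw):
    """Return {kind: payload}, skipping NOPs, as an order-agnostic receiver would."""
    found = {}
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:        # end of option list
            break
        if kind == 1:        # NOP has no length byte
            i += 1
            continue
        length = raw[i + 1]
        if length < 2:
            raise ValueError("malformed option length")
        found[kind] = raw[i + 2:i + length]
        i += length
    return found


# Two hypothetical orderings of the same options.
order_a = build_options(
    [mss(1460), sack_permitted(), timestamps(12345), NOP, window_scale(7)])
order_b = build_options(
    [mss(1460), NOP, window_scale(7), sack_permitted(), timestamps(12345)])

print(order_a.hex())
print(order_b.hex())
print(parse_options(order_a) == parse_options(order_b))
```

Both byte strings are the same length and carry identical information; a receiver that walks the option list by kind and length cannot tell them apart, while a device that expects options at fixed offsets can.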
Networking change causes distribution headaches
Posted Oct 30, 2008 7:00 UTC (Thu) by grahammm (guest, #773)
So what happens when (as is sure to happen some time) option A is needed to avoid a bug in one particular piece of hardware and option B to avoid a bug in a different one?
Networking change causes distribution headaches
Posted Oct 30, 2008 14:44 UTC (Thu) by mrshiny (subscriber, #4266)
Worry about that when it happens. Until then, zero-cost workarounds that prevent loss of functionality are more desirable than some sort of notion of purity.
Networking change causes distribution headaches
Posted Oct 29, 2008 12:59 UTC (Wed) by epa (subscriber, #39769)
> the intention is to alert the user that they have non-compliant hardware on their network and they should upgrade.
Yes, that's exactly what the intention is. Clearly, what the users want most of all is not to get their work done, but to receive useful and informative messages about hardware purchases they need to make in order to remain fully standards-compliant. Imagine a new user's heartfelt shame on first installing Linux and finding out they had been running a router that didn't strictly follow the RFCs, soon turning to joy and gratitude that Linux had revealed their sins and given them an opportunity to buy a replacement, helping to financially support honest manufacturers who test their products with all the world's wide diversity of operating systems.
Compared to these noble goals, it would be baseness and narrow-mindedness indeed for anyone to complain that Linux "doesn't work" or does not let them access networks that seemingly worked with Microsoft Windows. Indeed, we should surely add more of these features to the kernel, righteously refusing to work with any hardware or program that doesn't correctly implement standards, to lead us further towards the goal of a world where all computers work harmoniously together. Let Linux lead the way!
(Excuse the excess of sarcasm, I'm really missing the Linux Hater's Blog since he stopped posting.)
Networking change causes distribution headaches
Posted Oct 29, 2008 18:49 UTC (Wed) by ncm (subscriber, #165)
Imagine, further, the joy in the detective work to identify and locate the owner of each intermediate router discarding one's packets, and the further joy of warm human contact achieved while persuading said owners to upgrade their equipment, and said owners' joy in locating and installing upgrades, and in finally having compliant equipment.
Such an outpouring of joy could not but uplift Ubuntu's standing in the world.