Networking change causes distribution headaches
A seemingly innocuous change to the networking code that went into the 2.6.27 kernel is now causing trouble for various distributions. Ubuntu, Fedora, and openSUSE are all buttoning up their packages for a release in the near future—with Ubuntu's due this week—so kernel changes are not particularly welcome. Unfortunately, if the problem is not addressed, some users may never be able to download a fix because their TCP/IP won't interoperate with some broken equipment on the internet.
The problem stems from changes to clean up the TCP option code, merged back in July as part of the 2.6.27 merge window. TCP options are a mechanism to expand the functionality of the protocol as conditions change. There are a handful of commonly used options that the two endpoints of a connection can agree to use, for things like maximum segment size (MSS), window scaling, selective acknowledgment (SACK), and timestamps. Options have been added over time to provide more internet robustness and performance as well as to support higher-bandwidth physical connections.
A perfectly reasonable, if unintended, consequence of the code change was that the options were put into the header in a slightly different order. According to the relevant RFCs, options can appear in any order in the option section of the TCP header. But some home and/or internet routers seem to expect a fixed order, refusing to make connections if the order is "wrong". In particular, it would seem that the MSS option needs to appear before the SACK option.
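To make the ordering concrete, here is a rough sketch, in plain Python rather than the kernel's C, of how TCP options are laid out as kind/length/value blobs (option kinds per the RFCs: MSS=2, window scale=3, SACK-permitted=4, timestamps=8). Either concatenation below is legal on the wire; the broken routers only accepted the first:

```python
import struct

# Each TCP option is a kind byte, a length byte, then the value.
def mss(value):
    return struct.pack("!BBH", 2, 4, value)           # maximum segment size

def window_scale(shift):
    return struct.pack("!BBB", 3, 3, shift)           # window scale shift count

def sack_permitted():
    return struct.pack("!BB", 4, 2)                   # SACK permitted

def timestamps(tsval, tsecr):
    return struct.pack("!BBII", 8, 10, tsval, tsecr)  # TSval, TSecr

# Both orders are valid per the RFCs; only the first (MSS before SACK)
# worked with the broken routers.
traditional = mss(1460) + sack_permitted() + window_scale(7)
reordered = sack_permitted() + window_scale(7) + mss(1460)
print(traditional.hex())  # 020405b40402030307
print(reordered.hex())    # 0402030307020405b4
```

(A real SYN would also pad the option area out to a multiple of four bytes with NOP options; that detail is omitted here.)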
The bug was reported to Ubuntu Launchpad in early September, but not a lot of progress was made until it was added to the kernel.org bugzilla in early October. It seems to have only affected a relatively small number of users—Red Hat's Dave Jones said that there were no reports from users of the rawhide 2.6.27 kernel—as it was rather hardware-specific. This made it difficult to track down for the majority of folks who couldn't reproduce it. Ubuntu user Aldo Maggi, who filed the kernel bug, set a marvelous example of how to work with the kernel hackers to track down the problem, as can be seen in the bugzilla entry.
Eventually, the option re-ordering problem was discovered and a patch was submitted by Ilpo Järvinen that restored the order of the options. Along the way, with help from Mandriva, it was discovered that turning off TCP timestamps by way of:
    sysctl -w net.ipv4.tcp_timestamps=0

worked around the problem without changing the kernel—at the cost of losing the TCP timestamp functionality.
So it would seem that the problem has been solved—the patch has been merged into Linus Torvalds's tree for 2.6.28—but there are still a few unresolved issues. The three distributions that are preparing new releases are all based on 2.6.27, but as yet, there has not been a -stable kernel release that picks up the patch, though it is likely to come fairly soon.
In the meantime, Fedora has added the patch to its kernel in rawhide, so Fedora 10 (and eventually Fedora 9 when it gets rebased on 2.6.27) will have the fix. openSUSE is waiting a bit to see what gets submitted by the kernel networking developers to the -stable team. As Novell/SUSE kernel hacker Greg Kroah-Hartman puts it: "We still have a while to go before the final 11.1 kernel is released, so we feel no pressure here." Unfortunately, Ubuntu got caught very late in its release cycle as 8.10 (or Intrepid Ibex) is due on October 30.
The original plan as outlined by Debian/Ubuntu hacker Steve Langasek was to note the problem in the release notes for 8.10, but not address the underlying problem until after the release.
That led many in the Launchpad bug thread to note that it was going to be a real mess, especially for the least technical of users. Nick Lowe sums up the problem:
RC shouldn't mean Release ComeHellOrHighWater
The users who are most likely to hit this are home users behind their aged/unmaintained consumer routers who are highly unlikely to understand why they can't access the Web and will just go elsewhere...
Certainly, the release notes are not the first place an affected user would go if they ran into the problem. More than likely, they would just decide that Ubuntu—by extension Linux—is simply broken, so it is a relief to see that Ubuntu eventually relented. For 8.10, the procps package has been changed to work around the problem by turning off timestamps. Once a new kernel package is released with the re-ordering patch included, timestamps can presumably be restored.
This kind of problem—where affected users may not be able to retrieve an update to fix it—should really be part of the definition of a show-stopping (i.e. release date slipping) problem. It was rather galling to some that Ubuntu would consider shipping with this known issue, simply to make its 8.10 release in the 10th month of 2008 (which is how Ubuntu releases are numbered).
Ubuntu is justifiably proud of its record of shipping releases on time, but it cannot do that at the expense of its users. While the workaround that was implemented was suboptimal, perhaps, it does ensure that users—especially non-technical users—won't find that web surfing doesn't work in Linux. It should also allow Ubuntu to release on schedule.
[ Thanks to Nick Lowe for giving us a heads-up about this issue. ]
Posted Oct 28, 2008 20:48 UTC (Tue)
by pj (subscriber, #4506)
[Link] (11 responses)
Posted Oct 28, 2008 21:22 UTC (Tue)
by ca9mbu (guest, #11098)
[Link] (10 responses)
Yes, it sucks that this had the potential to impact Ubuntu's release schedule, but I guess that's the price one pays for time-based releases (which I'm all in favour of). I don't agree with Ubuntu's decision to workaround the issue via a procps update as opposed to a kernel update just to avoid a release slippage, but then I'm not the RM (or even involved with Ubuntu in any capacity).
Regards,
Matt.
Posted Oct 29, 2008 1:07 UTC (Wed)
by jordip (guest, #47356)
[Link] (9 responses)
On the other hand 10/01 will make the Ubuntu release farther in time from Fedora and openSUSE ...
Posted Oct 29, 2008 5:27 UTC (Wed)
by ncm (guest, #165)
[Link] (6 responses)
Posted Oct 29, 2008 8:24 UTC (Wed)
by rvfh (guest, #31018)
[Link] (3 responses)
Posted Oct 29, 2008 11:25 UTC (Wed)
by mjg59 (subscriber, #23239)
[Link] (2 responses)
Posted Oct 29, 2008 16:30 UTC (Wed)
by jzbiciak (guest, #5246)
[Link] (1 responses)
Seriously, why can't procps look at /proc/version and presume that 2.6.27 is broken, but any other version (including 2.6.27.1). As long as it's looking very specifically for the broken version's version string there, it should work ok.
Sure, if someone installs a broken kernel with a different string, then the workaround won't kick in, but I don't really see a problem with that. If you're installing your own kernel rather than sticking to vendor kernels, then you're signing up to own a bit more of the problem yourself, don't ya think?
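A sketch of that idea, in Python rather than procps's actual implementation, with the caveat that distribution kernels usually report suffixed versions like "2.6.27-7-generic", so a real check would need to match the vendor's exact string:

```python
# Hypothetical version gate for the workaround: only the one affected
# release gets timestamps turned off; 2.6.27.1 and anything else do not.

def needs_timestamp_workaround(release):
    return release == "2.6.27"   # match only the broken version's string

def running_kernel_release():
    # /proc/version starts with "Linux version <release> ..."
    with open("/proc/version") as f:
        return f.read().split()[2]

for release in ("2.6.27", "2.6.27.1", "2.6.28"):
    action = "set tcp_timestamps=0" if needs_timestamp_workaround(release) \
             else "leave timestamps on"
    print(release, "->", action)
```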
Posted Oct 29, 2008 18:21 UTC (Wed)
by jzbiciak (guest, #5246)
[Link]
Seriously, why can't procps look at /proc/version and presume that 2.6.27 is broken, but not any other version (including 2.6.27.1) oops. :-) Fixed that.
Posted Oct 29, 2008 12:40 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Oct 29, 2008 18:42 UTC (Wed)
by ncm (guest, #165)
[Link]
Posted Oct 30, 2008 1:05 UTC (Thu)
by sbergman27 (guest, #10767)
[Link] (1 responses)
Ubuntu has traditionally been more conservative. But in a "practice what we preach" action, they sync'd up with the other major distros which were planning to use 2.6.27 to help synchronize problem finding and debugging focus. Personally, the way kernel development is done these days I think the distros need to lag kernel releases a bit more. The fall releases, with the exception of Fedora, should really have targeted 2.6.26. I'm not criticizing the current kernel development process (though I gravely note Andrew's ongoing quality concerns), but the 2.6 dev process means that the distros are responsible for that much more of the QA. And that can't be done in a hurry. This particular issue doesn't seem too severe. But a month before general availability, going gold, or whatever you want to call it, the included kernel shouldn't be physically destroying beta testers' hardware or otherwise exhibiting behavior of baby-eating magnitude.
Posted Oct 30, 2008 13:50 UTC (Thu)
by filipjoelsson (guest, #2622)
[Link]
In the earlier series of kernels, the vendor patchsets were much larger - and contained everything from drivers and filesystems to security fixes. I would argue that there was less QA before kernel release in the 2.0, 2.2 and 2.4 series (where the QA was: it boots Linus'/Alan's/Marcelo's computer). Ok, so now there is a much bigger difference between each point version than there was then, but still - the vendors cooperate in the same kernel tree to a much larger extent, and there actually _is_ QA now. Anyone remember versions 2.2.0 and 2.2.1 (what were they, one or two days apart)? Care to have a chat about kernel versions 2.4.0 to 2.4.13?
What we have now is tremendously better tested than the old and ancient series.
Posted Oct 28, 2008 21:55 UTC (Tue)
by rfunk (subscriber, #4054)
[Link]
Posted Oct 28, 2008 23:34 UTC (Tue)
by PaulWay (guest, #45600)
[Link] (18 responses)
Yes, this is a hack, and I for one hate hacks that permit bad behaviour in other devices at the expense of maintainability and simplicity of the non-offending code. But it may be a better option than turning all TCP timestamps off or reverting the kernel.
It might also provide a way to alert users that their networking hardware needs updating, which solves the problem in a more permanent way.
Posted Oct 28, 2008 23:43 UTC (Tue)
by rfunk (subscriber, #4054)
[Link] (17 responses)
Posted Oct 29, 2008 0:24 UTC (Wed)
by dlang (guest, #313)
[Link] (8 responses)
the problem is that it has been discovered that there are some routers out there that do not follow the RFCs and only work if things get transmitted in one specific order.
so the kernel has been changed (post 2.6.27) to transmit in the order that this batch of broken routers require.
for bonus points, what should the kernel do if another batch of broken routers is discovered that wants a different order?
Posted Oct 29, 2008 1:08 UTC (Wed)
by jamesh (guest, #1159)
[Link] (3 responses)
Presumably, the current broken routers work with the packets generated by Windows. If a new router expected a different option order it wouldn't work with Windows, which is the kind of problem that would be noticed.
Posted Oct 29, 2008 1:29 UTC (Wed)
by dlang (guest, #313)
[Link] (2 responses)
with that mindset we can never be better than windows.
yes, it is the case with dodgy hardware that sometimes we do end up saying that 'windows does it this way and it works, the hardware doesn't follow the specs so we just need to do it the same way'
but to take that attitude about something that's supposed to be as generic as your network packets can be crippling.
Posted Oct 29, 2008 2:23 UTC (Wed)
by corbet (editor, #1)
[Link]
Posted Nov 1, 2008 4:32 UTC (Sat)
by jbailey (guest, #16890)
[Link]
Tks,
Posted Oct 29, 2008 3:48 UTC (Wed)
by gdt (subscriber, #6284)
[Link] (3 responses)
This issue is hardly the first home router or firewall issue encountered: some break on ECN, some break on SACK, some incorrectly handle large window scale values. Some of those home routers with bugs run Linux.
It is disappointing that Ubuntu chose to limit the performance of TCP rather than ship a patched kernel.
Posted Oct 29, 2008 4:02 UTC (Wed)
by dlang (guest, #313)
[Link] (2 responses)
I've seen many cases where doing the time calls in the TCP stack becomes the limiting factor, so disabling this should speed up TCP; it limits the features, but not the performance.
Posted Oct 29, 2008 14:52 UTC (Wed)
by drag (guest, #31333)
[Link] (1 responses)
I can't believe Ubuntu people are so closed-minded that they can't understand that if you can't get out on the internet to download a fixed kernel, then you're screwed. Your only option, as an end user, is to download the kernel fix post-installation. But if you can't get to it because your kernel is triggering a common TCP implementation bug.. then you're SOL.
There is a similar issue with DNS brokenness with Linux in general. As in: Linux behaving correctly, but getting bad results because an ISP can't get their shit straight or you have a buggy DNS proxy in some SOHO router. This is pretty common and it prevents end users from being able to reliably use some websites, which otherwise work perfectly well in any other OS. (The fix is usually to install a local DNS caching service like dnsmasq on the system.)
Posted Oct 29, 2008 18:57 UTC (Wed)
by tialaramex (subscriber, #21167)
[Link]
Prior examples include: DNS servers that silently ignore AAAA requests instead of replying that there's no matching record, causing a timeout for users who merely /enquired/ if they could use IPv6. IP "firewalls" that drop every type of ICMP packet indiscriminately by default. HTTP servers that silently accept pipelined requests, but don't pipeline the answers - so it answers all your HTTP queries, but the results are arbitrarily muddled together. Home routers that silently modify any 4 byte sequence resembling your private IP address to the 4 bytes representing the masqueraded public address? Yes, those really exist. Sometimes it seems like it'd be better to flush it away and start over - but don't make that mistake, we'd make just as many errors next time.
Although they seem to be the worst offenders, the proprietary systems aren't the only ones making these goofs. Samba's buggy attempt at early implementation of a new Windows SMB feature meant that not only could you not use the feature with Samba, but Microsoft had to disable it for Windows clients too, so everyone lost.
And let's not dwell on Debian's OpenSSL goof. To achieve a reasonable expectation of security everyone's SSL implementations should be updated to regard all the affected keys as weak, and reject them outright - but doing that means a permanent increase in the overhead of using SSL forever and for everyone in the whole world. Ouch.
Posted Oct 29, 2008 1:29 UTC (Wed)
by brianomahoney (guest, #6206)
[Link] (1 responses)
... understand that breaking something that was working is a very bad thing. I completely agree with Linus that these REGRESSIONS are to be avoided, and fixed ASAP. There are all too many these days, in the kernel eg this, e1000e and userland eg latest Firefox/Seamonkey breaking non-CUPS printing. At the least they are very irritating and usually time-consuming.
While it is true that newbies should not be using alpha, beta stuff, it is true that fewer and fewer corner-cases are being tested before shipping the newest-latest ... ops!
Posted Oct 29, 2008 4:02 UTC (Wed)
by dlang (guest, #313)
[Link]
the kernel developers did fix it quickly after it was reported.
it's impossible to test against all hardware as there is nobody in the world that has one of everything to test against (especially when you consider that firmware updates can radically change the behavior as well)
Posted Oct 29, 2008 2:51 UTC (Wed)
by PaulWay (guest, #45600)
[Link] (5 responses)
Because Linux is not an operating system that says "well, it sort of kind of works, that's good enough, why change it?" to decisions like this. Reverting back to the previous behaviour is good to fix the problem short-term, but a long-term solution needs to be developed.
IMO patching device drivers and kernels to make them work with hardware in the machine is (vaguely) acceptable; the further the device is from the machine, the more it's not the kernel's responsibility.
Posted Oct 29, 2008 4:31 UTC (Wed)
by jamesh (guest, #1159)
[Link] (2 responses)
One of the options happens to avoid a bug in certain hardware, probably due to matching the behaviour of a certain competing operating system. Why on earth wouldn't you choose that option?
Your suggestion would result in more complex code that has the potential to be slower and more buggy.
Posted Oct 30, 2008 7:00 UTC (Thu)
by grahammm (guest, #773)
[Link] (1 responses)
Posted Oct 30, 2008 14:44 UTC (Thu)
by mrshiny (guest, #4266)
[Link]
Posted Oct 29, 2008 12:59 UTC (Wed)
by epa (subscriber, #39769)
[Link] (1 responses)
Compared to these noble goals, it would be baseness and narrow-mindedness indeed for anyone to complain that Linux "doesn't work" or does not let them access networks that seemingly worked with Microsoft Windows. Indeed, we should surely add more of these features to the kernel, righteously refusing to work with any hardware or program that doesn't correctly implement standards, to lead us further towards the goal of a world where all computers work harmoniously together. Let Linux lead the way!
(Excuse the excess of sarcasm, I'm really missing the Linux Hater's Blog since he stopped posting.)
Posted Oct 29, 2008 18:49 UTC (Wed)
by ncm (guest, #165)
[Link]
Such an outpouring of joy could not but uplift Ubuntu's standing in the world.
Posted Oct 29, 2008 6:16 UTC (Wed)
by davem (guest, #4154)
[Link] (9 responses)
Turning TCP timestamps off has severe consequences, not as severe as the lack of connectivity this is trying to work around, but pretty severe.
It has security implications in fact, it makes the range of what an attacker has to guess to forge packets into your TCP stream MUCH smaller, for one thing.
It's a really crass move on Ubuntu's part to be so asinine about making kernel changes, even obvious critical ones like this option fix, at a late stage. I've run into this problem with them in the past when they supported sparc64, and this analness wrt. last minute kernel changes hurts them a lot.
Posted Oct 29, 2008 7:19 UTC (Wed)
by jspaleta (subscriber, #50639)
[Link] (7 responses)
For all the crap I give Canonical for other decisions, I'm not going to beat them up over a time-sensitive judgement call concerning a technical regression in the 11th hour 57th minute of their release cycle. I would not wish this sort of situation on any distributor with a deadline to meet. They were pressed, they made a judgement call, a judgement call which hopefully ensures that all installs have working network connectivity so all users can install updates as soon as the install is complete.
Distribution release processes...are painful. I think of it as akin to how childbirth is depicted in older movies. Every time the Fedora release team is in their final week during a release I feel like I need to be boiling water like you see anxious fathers being told to do by midwives... or something equally futile to stay out of releng's way (lwn posting might count). The release freeze process itself always causes a window of delay where security fixes can crop up that can't be included in the composed "release" tree without scrapping the whole compose process and starting over.
For whatever security implications the chosen quickfix has for Ubuntu users, hopefully Ubuntu will be able to put out a release day update for 8.10 that fixes the issue properly.
It's moot anyways, most people should be boycotting self-installing 8.10 at release anyways and purchasing it as part of a shiny new Dell pre-install to bolster pre-installed linux OEM demand statistics for this fiscal quarter. Dell will apply available updates for you as part of the pre-install.
-jef
Posted Oct 29, 2008 7:45 UTC (Wed)
by Cato (guest, #7643)
[Link] (2 responses)
Most people should be running 8.04 not 8.10 in any case - as the first release after an LTS release, 8.10 is going to be more risky (hence the 'Intrepid', just like the 'Edgy' for 6.10).
I don't normally run a new Ubuntu release in production until a month or two has elapsed in any case.
Posted Oct 29, 2008 14:56 UTC (Wed)
by drag (guest, #31333)
[Link] (1 responses)
In order to download the "fixed" kernel you need to be able to get on the Internet. If you can't get on the internet due to a "broken" kernel then how exactly are you supposed to solve your problem?
Catch-22.
The _only_ effective fix for Ubuntu, at this point, is to include the "fixed" kernel with the installer.
And I can understand if they can't pull it off. But it's going to suck for some people that they can't.
(Of course, I am fully aware that it's not Linux being broken, just the environment that Linux is expected to operate in has buggy network hardware sometimes)
Posted Oct 29, 2008 15:00 UTC (Wed)
by drag (guest, #31333)
[Link]
That's what I get for not reading it to the end. They are a lot smarter than me, after all.
Posted Oct 29, 2008 15:18 UTC (Wed)
by TRS-80 (guest, #1804)
[Link]
Posted Oct 29, 2008 15:40 UTC (Wed)
by nevyn (guest, #33129)
[Link] (1 responses)
Understandably you're thinking of rpm here and not dpkg. Because dpkg has no way to do "installonly" type packages, the kernel has the version in its name ... thus there's no good way to say in procps "Requires: kernel >= 2.6.27-2". They might hack it by having a dep. from the fixed kernel on the newer procps, or they might release a procps later and assume no one will install that and use the GA kernel ... but they might just leave timestamps off for 8.10.
Personally it seems like they made a poor choice, but as you point out there are other more fundamental problems ... so this one is not high on the list, IMO.
Posted Oct 29, 2008 17:06 UTC (Wed)
by jspaleta (subscriber, #50639)
[Link]
I can only assume that the Ubuntu release team thought this through and have the ability to push an update out that reverts the quick fix when a proper fix is available and tested.
If there are security implications for turning timestamping off, then intrepid Intrepid users should probably impress on the Ubuntu devs the importance of turning timestamping on as an update as soon as possible to limit exposure...in the appropriate Ubuntu communications channel.
-jef
Posted Oct 29, 2008 23:43 UTC (Wed)
by xoddam (subscriber, #2322)
[Link]
Well yes. I did *ask* for it, but Dell for some reason still doesn't sell them in a whole heap of countries, so we simply chose a machine on which Ubuntu is sold pre-installed elsewhere, and installed it ourselves. I'd have chosen the version with Intel graphics (fully supported in free software), but that too isn't available here, so I have a fancy evil nvidia GPU, which luckily is documented to work fine with free-software drivers. And so it did, until I tried to resume after suspend-to-ram. No display.
How fortunate that Ubuntu makes it easy for me to "give up my freedom" and switch to the nasty source-free driver from the GPU maker. I intend to give some time to sorting out the suspend/resume problem with the nv developers (and/or the ubuntu xorg maintainers), but since it means a reboot every time it doesn't work, it will be a very time-consuming process and I couldn't use the machine for its intended purpose meanwhile. Which would annoy my employer, who paid for it.
Posted Oct 30, 2008 18:23 UTC (Thu)
by busterb (guest, #560)
[Link]
Setting up procps (1:3.2.7-9ubuntu2.1) ...
Setting up linux-headers-2.6.27-7 (2.6.27-7.15) ...
Posted Oct 29, 2008 15:22 UTC (Wed)
by AJWM (guest, #15888)
[Link] (5 responses)
If the hardware doesn't conform to spec in this instance, who knows what other traps lie lurking in the defective hardware implementation?
Posted Oct 29, 2008 16:56 UTC (Wed)
by jspaleta (subscriber, #50639)
[Link] (4 responses)
That's the biggest problem with this issue, we don't know how widespread it is.
If there were a way to test for brokenness without having users boot into an affected kernel -- something we could have them run as a quick test app -- I'd be more than happy to take my rhetorical skills to the Fedora userbase and encourage them to test their network gear for brokenness and report back, so we can get a better handle on which gear manufacturers we need to talk to about firmware updates.
-jef
Posted Oct 29, 2008 17:10 UTC (Wed)
by jake (editor, #205)
[Link] (3 responses)
Hmm, I thought another dimension of the problem was that it is not clear that it is only home routers that are problematic. If there is gear installed at the ISPs that is affected by this problem, it doesn't much matter what gear we buy. Which is not to say that it would not be worth knowing, just that no matter how much testing is done and how many new home routers are bought, we may still be routed through bad hardware.
jake
Posted Oct 29, 2008 18:51 UTC (Wed)
by ncm (guest, #165)
[Link]
Posted Oct 30, 2008 15:27 UTC (Thu)
by AJWM (guest, #15888)
[Link] (1 responses)
(Minor rant mode: I don't know if it's just me noticing it more, or the problem is getting worse, but lately I'm seeing a lot of messages (posts and emails) complaining about problems without providing any specifics that would allow me (or someone) to investigate/fix the problem. Maybe it's the run-up to the election: all these zero-real-content political messages are causing widespread brain damage. Minor rant mode off.)
Posted Oct 30, 2008 15:59 UTC (Thu)
by jake (editor, #205)
[Link]
Which would be great of course. It is just not clear to me how Linux users who are experiencing problems with their TCP connectivity will be able to even determine what hardware is causing the problem. They may be able to switch to a known-good home router, but if they still have the problem, it is not obvious (at least to me) how to diagnose it from there. ISPs, at least in my experience, are not very interested in discussing their networking gear with their customers. Alerting them to the problem might help, at least in some cases, but it really isn't ever going to allow Linux to put the options in any arbitrary order.
jake
Posted Oct 29, 2008 20:29 UTC (Wed)
by davem (guest, #4154)
[Link] (13 responses)
If you turn timestamps off, at rates of 1GB/s and above you
Ubuntu made the wrong decision, there is simply no argument for
I don't understand why everyone gets their tits in a knot when
To reiterate, if timestamps are off, you are exposed to possible
Posted Oct 29, 2008 22:04 UTC (Wed)
by jspaleta (subscriber, #50639)
[Link] (11 responses)
Has this been communicated into the Ubuntu bug tracker?
-jef
Posted Oct 29, 2008 22:17 UTC (Wed)
by nick.lowe (guest, #54609)
[Link] (10 responses)
Posted Oct 29, 2008 22:32 UTC (Wed)
by jspaleta (subscriber, #50639)
[Link] (9 responses)
-jef
Posted Oct 29, 2008 22:54 UTC (Wed)
by nick.lowe (guest, #54609)
[Link] (8 responses)
No, not at all. :)
It is a separate issue.
The workaround -introduces- a data corruption problem at high data rates because it disables PAWS protection in the TCP/IP stack by virtue of the timestamps no longer being there.
The issue here is that the server release will go out with this, which will be run on machines highly likely to see these data rates!
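For context, PAWS (Protection Against Wrapped Sequence numbers, RFC 1323) uses the timestamp option to reject old duplicate segments that a wrapped 32-bit sequence number would otherwise let back into the stream. A toy sketch of the receive-side idea (not the kernel's code):

```python
# Toy model of the PAWS check from RFC 1323: a segment is dropped if its
# timestamp is older (modulo 32-bit wraparound) than the most recent
# timestamp seen on the connection.

MOD = 2 ** 32

def ts_older(t1, t2):
    """True if timestamp t1 is strictly older than t2 under wraparound."""
    diff = (t2 - t1) % MOD
    return diff != 0 and diff < MOD // 2

def paws_accept(segment_tsval, ts_recent):
    # With timestamps disabled there is nothing to compare, so an old
    # segment from a previous trip around the sequence space can be
    # accepted as fresh data -- the corruption risk described above.
    return not ts_older(segment_tsval, ts_recent)

print(paws_accept(1000, 900))   # newer timestamp: accepted
print(paws_accept(900, 1000))   # stale duplicate: rejected
```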
Posted Oct 30, 2008 3:02 UTC (Thu)
by njs (subscriber, #40338)
[Link] (7 responses)
Obviously this whole situation is unfortunate, but... is your suggestion really that there are large numbers of people with GB/s equipment who are likely to jump to a non-LTS Ubuntu release, on the first day, and don't read release notes, and don't install updates? Because that seems like a relatively narrow slice of the userbase to me -- not so narrow it should be ignored, so I'm glad you're continuing to help the ubuntu devs keep on top of things, but narrow enough that some of the other folks in this thread could maybe stand to relax a bit...
Posted Oct 30, 2008 3:40 UTC (Thu)
by nick.lowe (guest, #54609)
[Link]
Posted Oct 30, 2008 6:27 UTC (Thu)
by davem (guest, #4154)
[Link] (1 responses)
I just mentioned the data corrupter just to show how absolutely
Want to know the litmus test of how stupid this is? Not one
That's the definition of failure.
Posted Oct 30, 2008 16:08 UTC (Thu)
by hppnq (guest, #14462)
[Link]
Here's a simple explanation for Ubuntu's decision.
As a side note: for home users -- who are extremely unlikely to be running at high enough data rates -- there is (also) the option to revert to the last working kernel. Maybe in a next release, this specific kind of distribution problem will actually be "solved" by Ubuntu. Which would be very nice.
Second side note: a couple of years ago PAWS users were vulnerable -- on a rather big scale -- to a remote DoS. Your mileage will always vary.
Posted Oct 30, 2008 9:49 UTC (Thu)
by ncm (guest, #165)
[Link] (3 responses)
Posted Oct 30, 2008 9:56 UTC (Thu)
by njs (subscriber, #40338)
[Link]
Sure, maybe. Is your objection literally that you don't know how they're going to phase it out, or that you're worried that in fact they won't phase it out? I don't know how they're planning to do it (though the suggestion somewhere upthread of checking the runtime version of the kernel sounded plausible to me, and they can just drop it altogether in 9.04 in any case), but I'm pretty confident that they don't want to carry this annoyingness around and will find some way to get rid of it, and the exact mechanism they choose doesn't affect me, so I don't really care what it is.
Posted Oct 31, 2008 1:43 UTC (Fri)
by jamesh (guest, #1159)
[Link] (1 responses)
Posted Oct 31, 2008 2:11 UTC (Fri)
by ncm (guest, #165)
[Link]
At what point will the download CD/DVD images get the updates?
Posted Oct 31, 2008 18:50 UTC (Fri)
by Cato (guest, #7643)
[Link]
IMHO, anyone who is not running Ubuntu on a supercomputer using that type of network connection can safely ignore the chance of the PAWS issue corrupting their data.
Even if you meant 1 Gbps, that's an impressive sustained rate for a high latency network session (i.e. the ones where sequence numbers matter, hence over a WAN). A Gigabit Ethernet LAN would almost by definition have low latencies (a few milliseconds).
I'm generalising here, but I really think the PAWS issue is irrelevant to people who are likely to use Ubuntu. If it was an HPC distro it would be quite different.
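The arithmetic behind that judgement is easy to check. A quick back-of-envelope (plain Python) for how long the 32-bit sequence space takes to wrap at various rates, which is when PAWS starts to matter:

```python
SEQ_SPACE = 2 ** 32   # bytes before the TCP sequence number wraps

def wrap_seconds(bytes_per_second):
    """Time for one full trip around the sequence space at a given rate."""
    return SEQ_SPACE / bytes_per_second

for label, rate in (("10 Mbps", 10e6 / 8),
                    ("1 Gbps", 1e9 / 8),
                    ("1 GB/s", 1e9)):
    print(f"{label:>8}: wraps every {wrap_seconds(rate):7.1f} s")
```

At 1 GB/s the sequence space wraps in about 4.3 seconds, well within the lifetime an old segment can have on a wide-area path; at 10 Mbps a wrap takes nearly an hour, which is the sense in which home users are unlikely to be affected.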
The problem here is that Ubuntu released too close to a kernel release. It was a known issue, but 2.6.27 had enough benefits to outweigh that.
I think future releases of Ubuntu will be more conservative in this regard.
Also, a faster Ubuntu bugtracker -> kernel bugtracker interaction may have saved the day. This is an old discussion now...
Cool, the "Cruel To Be Kind" guy runs Linux! ;-)
"Thanks to Nick Lowe"
(What? What's so funny?)
the "right" order? Sounds like adding lots of complication for nearly zero gain.
> broken routers is discovered that wants a different order?
The sad fact is that "what does Windows do?" is a question that kernel developers often have to keep in mind. Whatever Windows does is what's actually tested; it's often the only thing that works. It's a pain.
managed to make all sorts of devices on the Internet not cope with Linux, so it had to be disabled in order to work. But it's still there and an option. This thing isn't going to matter one way or the other, so it may as well be done as Windows does it to avoid any hassle.
Jeff Bailey
> the intention is to alert the user that they have non-compliant hardware on their network and they should upgrade.
Yes, that's exactly what the intention is. Clearly, what the users want most of all is not to get their work done, but to receive useful and informative messages about hardware purchases they need to make in order to remain fully standards-compliant. Imagine a new user's heartfelt shame on first installing Linux and finding out they had been running a router that didn't strictly follow the RFCs, soon turning to joy and gratitude that Linux had revealed their sins and given them an opportunity to buy a replacement, helping to financially support honest manufacturers who test their products with all the world's wide diversity of operating systems.
Networking change causes distribution headaches
severe as the lack of connectivity this is trying to work around, but pretty severe.

what an attacker has to guess to forge packets into your TCP stream MUCH smaller, for one thing.

making kernel changes, even obvious critical ones like this option fix, at a late stage. I've run into this problem with them in the past when they supported sparc64, and this analness wrt. last-minute kernel changes hurts them a lot.
Networking change causes distribution headaches
I disagree - the reason they had to resort to a workaround instead of applying the actual patch was that there wasn't enough time in their schedule to rebuild the kernel and installer between RC and release. In other words, the Ubuntu schedule made it impossible to fix any show-stopping kernel issue directly if one was found once the RC was built, which is clearly an avoidable problem.
Networking change causes distribution headaches
Whatever security implications the chosen quick fix has for Ubuntu users, hopefully Ubuntu will be able to put out a release-day update for all 8.10 users that fixes the issue properly.
Networking change causes distribution headaches
I'm not going to falsely stand myself up as a network security expert and make a judgement on the validity of the security concern. Even if the security implications are a valid concern, I think it's reasonable for Ubuntu to use the option of having a release-day update available instead of having to restart their release process to incorporate the upstream fix. As long as a release-day update addresses the security implications by turning timestamping back on and integrates the proper kernel patch for the routing regression, the exposure is mitigated to the level of any security issue that requires a post-release update.
Networking change causes distribution headaches
Removing obsolete conffile /etc/sysctl.d/10-tcp-timestamps-workaround.conf
 * Setting kernel variables (/etc/sysctl.conf)...                        [ OK ]
 * Setting kernel variables (/etc/sysctl.d/10-console-messages.conf)...  [ OK ]
 * Setting kernel variables (/etc/sysctl.d/10-network-security.conf)...  [ OK ]
 * Setting kernel variables (/etc/sysctl.d/10-process-security.conf)...  [ OK ]
 * Setting kernel variables (/etc/sysctl.d/30-tracker.conf)...           [ OK ]
Setting up linux-headers-2.6.27-7-generic (2.6.27-7.15) ...
Examining /etc/kernel/header_postinst.d.
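The conffile being removed in that log is Ubuntu's interim workaround. I haven't inspected the package, but a sysctl.d fragment implementing "turn timestamps off" would presumably have looked like this one-liner (file name taken from the log above; the contents are my assumption):

```
# /etc/sysctl.d/10-tcp-timestamps-workaround.conf (presumed contents)
# Disable TCP timestamps so the option layout matches what the broken
# routers expect -- at the cost of losing PAWS protection.
net.ipv4.tcp_timestamps = 0
```

Once the proper kernel patch ships, removing the fragment (as the log shows) lets the default `net.ipv4.tcp_timestamps = 1` take effect again on the next boot or `sysctl -p`.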
So, which hardware is it?
> That's the biggest problem with this issue, we don't know how widespread it is.
So, which hardware is it?
> may care (and have influence) over more than just home-based gear. And those of us who work for vendors of such gear can raise their voices towards getting it fixed going forward.
Networking change causes distribution headaches
Ubuntu's release that doesn't fix the kernel and instead turns timestamps off.

are exposed to possible sequence number wraparound. This in turn can lead to data corruption. Without timestamps there is no PAWS protection (Protection Against Wrapped Sequence numbers), and thus at high enough data rates new data can be interpreted as old data and vice versa, corrupting your data stream.

the way this was "handled."

even the slightest suggestion of slipping a release is suggested in order to fix a serious bug like one of this magnitude. It is always the right thing to do, and it avoids crap like what is happening here.

data corruption.
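The wraparound hazard described above can be sketched in a few lines (an illustration of the RFC 1323 idea, not the kernel's implementation): sequence numbers live in a 32-bit space, so after enough data the same number comes around again, and only the monotonically increasing timestamp distinguishes a stale segment from a fresh one.

```python
SEQ_SPACE = 2 ** 32

def seq_after(a, b):
    """RFC 793 modular comparison: is sequence number a 'after' b?"""
    return (a - b) % SEQ_SPACE < 2 ** 31

# After 4 GiB of data the 32-bit sequence space wraps, so a stale segment
# can carry a number that looks current again.
stale = 1000
wrapped = (stale + SEQ_SPACE) % SEQ_SPACE  # one full trip around the space
assert wrapped == stale  # indistinguishable by sequence number alone

# PAWS (RFC 1323) breaks the tie: drop any segment whose timestamp value
# is older than the most recent one accepted on the connection.
def paws_accept(ts_val, ts_recent):
    return (ts_val - ts_recent) % SEQ_SPACE < 2 ** 31

assert not paws_accept(ts_val=50, ts_recent=60)  # stale segment rejected
assert paws_accept(ts_val=61, ts_recent=60)      # fresh segment accepted
```

With timestamps disabled there is no `ts_val` to check, which is why the workaround trades the broken-router problem for a (rate-dependent) corruption risk.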
Networking change causes distribution headaches
as I detailed in an earlier comment.

insane this was on just about every level.

damn ubuntu kernel developer asked any of the core networking folks for guidance on how to handle this problem. They didn't know the implications, and they didn't bother to ask people who did.
Well, you would have to be doing a distribution upgrade, boot into it immediately, and go into production without looking for updates. I think it is fair to assume that not too many people would run into TCP-timestamp-related corruption. If they really care about their data, obviously their scripts would notice the absence of TCP timestamping with this new release.
Phasing out the broken-workaround procps
Okay, then.