
The newest development model and 2.6.14

Posted Nov 4, 2005 4:14 UTC (Fri) by mgb (guest, #3226)
In reply to: The newest development model and 2.6.14 by malor
Parent article: The newest development model and 2.6.14

I've been running production systems on Unixen for a quarter century, including Linux for more than a decade. Late 2.4 was definitely the high-point.

2.6 may run better than 2.4 on 1024-way clusters, but who cares? For everyday use, 2.6 requires 2 to 4 times the RAM to accomplish the same task as 2.4. A laptop that used to handle a KDE desktop now takes more than half an hour just to boot a shell because of all the bureaucracy that replaced the efficient, reliable mknod in /dev.

2.6 kernel quality is not awful, but it's nothing to write home about either. And then we have the ever-changing kernel ABI and the kernel crew's antipathy to MadWifi, which makes Linux WiFi more trouble than it's worth. It's easier to carry around a LinkSys gateway than to hunt down new MadWifi RPMs for PCMCIA cards every time someone on the kernel team sneezes.

Many kernel hackers are now being paid professional salaries. Why then has product discipline dropped to amateur levels?



The newest development model and 2.6.14

Posted Nov 4, 2005 7:34 UTC (Fri) by emkey (guest, #144) [Link]

My pet peeve is the way new kernels will occasionally change things such that yesterday's eth0 is now eth5. This should NEVER EVER EVER CHANGE! I feel better now. Having to run tcpdump to figure out which interface goes with which network after a kernel update is just plain wrong.

The old model of kernel development had its issues, but it seems like things have swung to the opposite extreme, which is a pretty typical overreaction.

I predict another major change sometime in the next two to three years. Hopefully to a system that will stick around for the long haul.

The newest development model and 2.6.14

Posted Nov 4, 2005 15:05 UTC (Fri) by busterb (subscriber, #560) [Link]

There is an easy fix for that: give your interfaces fixed (even meaningful!) names. See 'man nameif'; it's installed by default with Debian, at least.
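For reference, nameif reads a mapping of names to MAC addresses from /etc/mactab at boot; a minimal sketch (the interface names and MAC addresses here are invented):

```shell
# /etc/mactab -- read by nameif at boot; one "<name> <MAC>" pair per line.
# The names and MAC addresses below are placeholders.
lan0   00:16:3e:aa:bb:01
dmz0   00:16:3e:aa:bb:02
```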

The newest development model and 2.6.14

Posted Nov 4, 2005 18:52 UTC (Fri) by yokem_55 (subscriber, #10498) [Link]

This sounds like a distro problem as opposed to a kernel problem. I've never seen an ethX interface change its name simply from updating the kernel that's running the device....

The newest development model and 2.6.14

Posted Nov 4, 2005 19:13 UTC (Fri) by emkey (guest, #144) [Link]

How many systems do you have with two or more interfaces? The higher the number the more likely you are to see this.

While I won't say this scenario is frequent, it should be nonexistent. Most recently I've witnessed it when going from RHEL3 to RHEL4 on a system with five interfaces. Could this be Red Hat's fault? Possibly, though based on other experiences I suspect this is more of a 2.4->2.6 issue.

The newest development model and 2.6.14

Posted Nov 4, 2005 22:11 UTC (Fri) by tjw.org (guest, #20716) [Link]

It could very well be RH's problem for changing the modprobe order. If you have different chipsets among your interfaces, the ethX name will be dependent on which kernel module gets loaded first.

You may be able to avoid these problems by just adding something like this to your modules.conf:

alias eth0 e100
alias eth1 e100
...
alias eth5 tulip

The newest development model and 2.6.14

Posted Nov 5, 2005 0:07 UTC (Sat) by emkey (guest, #144) [Link]

They are all e1000s in this case. And actually we have six total, though only five are in use. Two are built-in copper ports, and the other four are provided by dual fibre e1000 cards. The first two (eth0, eth1) remained the same after the upgrade. The cards swapped, though (i.e., eth2, eth3 became eth4, eth5 and eth4, eth5 became eth2, eth3).

A few minutes with tcpdump solved the mystery as to why things weren't working properly after the upgrade.

There may be a way around this, but there really shouldn't need to be.

The newest development model and 2.6.14

Posted Nov 6, 2005 23:47 UTC (Sun) by zblaxell (subscriber, #26385) [Link]

I think the real problem is that there is no guarantee of stability in unit enumeration in most distros. The mechanism certainly exists in the kernel, and has existed since 2.1.somewhere-near-100, but there isn't a user-space implementation installed by default on most distros.

eth0 is "the first detected ethernet card", eth5 is "the sixth detected ethernet card". If the eth0 card's PCI bus controller dies (as mine did a year or two ago), suddenly eth5 becomes eth4, eth4 becomes eth3, etc. This does horrible things if you were enforcing some kind of security on those devices, and the machine manages to come back up after this sort of failure. This can be triggered with just a lightning strike and a reboot--same kernel version, same distribution, but suddenly some or all of the ethernet cards have new names because a low-numbered one got zapped. It's inconvenient, but don't blame the kernel developers for breaking your fragile configuration.

This isn't a new problem that arose in 2.6; it has *always* been there. Use 'ip name ...' or 'nameif' to force your network devices to have specific names that don't match any possible default name. Set up your routing and firewall rules to use the specific names, and firewall everything that has an anonymous "eth0"-style name to the DROP target. Once configured, your interfaces will never be renamed again, although now you'll have to update the MAC address table every time you swap out a card or build a new machine.
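A sketch of that approach (the "lan0" name and the MAC address are invented, and the commands need root, so this is illustrative rather than tested):

```shell
# Rename the card with this MAC out of the default ethX namespace.
nameif lan0 00:16:3e:aa:bb:01
# Or, with iproute2 (the interface must be down while renaming):
#   ip link set eth0 down && ip link set eth0 name lan0 && ip link set lan0 up

# Then drop traffic on anything that still shows up under a default name.
iptables -A INPUT   -i eth+ -j DROP
iptables -A FORWARD -i eth+ -j DROP
```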

Distro vendors could help people in your situation (only read half of the manuals, built a broken configuration, got surprised when the interest payment on technical debt became due) by including a user-space tool which assigns dynamic but persistent device names, so "eth0" would become "the first ethernet card *ever* detected in the system", "eth1" would be "the second ethernet card *ever* detected," etc. Single-user systems would only see "eth0", gateway hosts would have "eth0" through "eth5" that behave the way you expect between reboots, machines which replaced a broken NIC would have just an "eth1" since there's no way for the system to know if the card-formerly-known-as-eth0 might come back one day. It might be somewhat inconvenient to replace a card (you'd have to update routing table and firewalls for eth1 instead of eth0), but that's what you get for not reading the manual--and if you did, you'd probably find the state file that defines the persistent mapping and just edit it manually.
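udev eventually shipped essentially this tool as a persistent-net rules generator; as an illustrative sketch (syntax from later udev versions, MAC addresses invented), the persistent state file is just one rule per card ever seen, and it can be edited by hand:

```shell
# /etc/udev/rules.d/70-persistent-net.rules (illustrative; MACs invented)
# A rule is appended the first time each NIC is detected, so the
# name sticks across reboots.
SUBSYSTEM=="net", ATTR{address}=="00:16:3e:aa:bb:01", NAME="eth0"
SUBSYSTEM=="net", ATTR{address}=="00:16:3e:aa:bb:02", NAME="eth1"
```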

The newest development model and 2.6.14

Posted Nov 7, 2005 2:23 UTC (Mon) by emkey (guest, #144) [Link]

In our situation we would never come up with a networking card "missing". We have mechanisms in place to make sure this doesn't happen.

Being able to tie a device to a given MAC address is potentially interesting though.

As for not reading the manuals, I don't read the source code to the kernel either. The environment I work in takes hundreds of pages to document and is changing in small ways on a daily basis. I read LWN and am always on the lookout for new sources of information, but the simple truth of the matter is I do not have the time to be an expert in every single aspect of Linux, networking, etc. I wish I did.

Thanks for the information.

The newest development model and 2.6.14

Posted Nov 7, 2005 2:29 UTC (Mon) by zblaxell (subscriber, #26385) [Link]

Last time I checked, mknod in /dev still works if you want it. udev is just a glorified automated mknod in /dev which the kernel invokes from time to time. devfs had a lot of kernel-side bureaucracy that you could only get rid of by removing it from the kernel, which as of 2.6.13 has now been done permanently. Your distro vendor may give you some trouble, if they've built the system to rely on devfs or udev.
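A quick sketch of the point that plain mknod still works. The demo below runs against a scratch directory with a FIFO, since creating real character devices needs root; the static /dev entries in the comments use the traditional major/minor assignments:

```shell
# A static /dev is just mknod calls. Unprivileged demo with a FIFO:
devdir=$(mktemp -d)
mknod "$devdir/myfifo" p       # 'p' = named pipe; no root needed
ls -l "$devdir/myfifo"

# With root, the classic static entries are made the same way, e.g.:
#   mknod /dev/null c 1 3      # char device, major 1, minor 3
#   mknod /dev/ttyS0 c 4 64    # first serial port
```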

I experienced significant slowdowns on my laptop running 2.6 after upgrading from 2.4, until I configured the I/O scheduler to use cfq instead of the anticipatory scheduler. The defaults seem to be tuned for systems with a pair of high-performance SCSI disks arranged in RAID0 or RAID1...but on a laptop hard drive they multiply boot times by 10.
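For anyone wanting to try the same tuning: on 2.6 the scheduler is visible (and, as root, switchable) per device through sysfs. A sketch, with device names depending on the machine:

```shell
# Show the available I/O schedulers for each block device;
# the active one is printed in brackets, e.g. "noop [anticipatory] cfq".
for q in /sys/block/*/queue/scheduler; do
    [ -e "$q" ] || continue
    echo "$q: $(cat "$q")"
done
# As root, switch a slow laptop drive (hda here, as an example) to cfq:
#   echo cfq > /sys/block/hda/queue/scheduler
```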

Measuring memory usage is different in 2.6. There are some new statistics, and statistics with old names are calculated differently, so it's hard to do a 1:1 comparison--and that in and of itself is annoying. It's hard to tell if there are more programs waiting for disk I/O because of increased RAM usage, or due to a new block I/O scheduler or some new kind of lazy or preemptive swapping scheme.
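The raw numbers are in /proc/meminfo on both kernels; a sketch of the fields worth comparing side by side, keeping in mind that 2.6 computes some of them differently:

```shell
# Free RAM alone is misleading -- page cache and buffers count too.
grep -E '^(MemTotal|MemFree|Buffers|Cached|SwapTotal|SwapFree):' /proc/meminfo
```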


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds