Williams: That's when I reach for my revolver...

Dan Williams examines the vagaries of mobile broadband cards in a posting on his blog. He reports on the problems when trying to get NetworkManager working with all of this different hardware. "Yes, there are standards. But as we all know, given 10 people and a standard, you'll end the day with 12 or 13 differently behaving "standards-compliant" implementations. People suck. You’d think it would be easy to agree on an AT command for "prefer 3G / prefer 2G / 3G only / 2G only". NO SIMPLE FOR YOU. But NetworkManager has to work around huge amounts of stupid. Here's a run-down of some of the mobile broadband hardware that’s available today and what about it sucks."

Williams: That's when I reach for my revolver...

Posted Mar 23, 2009 20:18 UTC (Mon) by davecb (subscriber, #1574) [Link]

Euargh!

My leaky memory says a colleague created a
VFS just for modem brain-damage, with each
operation having a pointer-to-function
and a pointer-to-string-parameter. The
usual code was 'send string and expect
return code', but sometimes there was a whole
mess of stuff in the code (;-))

--dave
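
The table-of-operations idea described above can be sketched roughly as follows. This is only an illustration in Python, not the colleague's actual code: `FakeModem`, `send_expect_ok`, `reset_twice`, and the two tables are hypothetical names, and the "needs the reset twice" quirk is invented to show where the "whole mess of stuff" would live. `ATZ` and `ATE0` are ordinary Hayes commands.

```python
from typing import Callable, Dict, List, Tuple


class FakeModem:
    """Hypothetical transport: records commands and always answers OK.

    A real implementation would write to a serial device instead.
    """

    def __init__(self) -> None:
        self.sent: List[str] = []

    def command(self, text: str) -> str:
        self.sent.append(text)
        return "OK"


Handler = Callable[[FakeModem, str], bool]


def send_expect_ok(modem: FakeModem, param: str) -> bool:
    """The usual case: send the string and expect a success code."""
    return modem.command(param) == "OK"


def reset_twice(modem: FakeModem, param: str) -> bool:
    """Invented quirk handler: an imaginary modem that needs two resets."""
    first = send_expect_ok(modem, param)
    second = send_expect_ok(modem, param)
    return first and second


# One table per modem model: operation name -> (handler, string parameter).
GENERIC_OPS: Dict[str, Tuple[Handler, str]] = {
    "reset": (send_expect_ok, "ATZ"),
    "echo_off": (send_expect_ok, "ATE0"),
}

# A quirky model overrides only the operations that misbehave.
QUIRKY_OPS: Dict[str, Tuple[Handler, str]] = dict(
    GENERIC_OPS, reset=(reset_twice, "ATZ")
)


def run(modem: FakeModem, ops: Dict[str, Tuple[Handler, str]], name: str) -> bool:
    """Look up the operation and call its handler with its string parameter."""
    handler, param = ops[name]
    return handler(modem, param)
```

The point of the design is that callers only ever say `run(modem, ops, "reset")`; the per-model mess stays inside the handler the table points at.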

Williams: That's when I reach for my revolver...

Posted Mar 23, 2009 21:20 UTC (Mon) by rathann (subscriber, #50815) [Link] (5 responses)

That's great. Now, if you could give me a configurable timeout for wifi association and dhcp like I asked, it'd be greater still. But you keep refusing, so no cookie for you until you do.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 0:10 UTC (Tue) by pabs (subscriber, #43278) [Link]

No cookie from me due to the silly 'let's break the network when the daemon restarts' attitude.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 10:03 UTC (Tue) by dcbw (guest, #50562) [Link] (3 responses)

Figure out *why* your associations take so long, and maybe I'll consider it. But saying you want a timeout without trying to see what the real problem is, where the real problem is triggered, and trying to fix that *first* is what I object to.

There's no reason why you need a longer association timeout if things work correctly. It should not take more than 20 or 30 seconds to associate with an access point. If you are near the margins of the network, move closer, deploy another access point in the weak coverage area, or get a better client antenna. If there is a lot of interference, you need to reconfigure the wifi network to use non-overlapping channels, or use the 802.11a band so your network isn't overrun with microwaves from the break-room at lunch.

Hacking around shit with one-off config options that don't actually fix the source of the problem is not the way to make things better; it's a way to create an unmaintainable, untestable pile of junk.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 10:36 UTC (Tue) by rathann (subscriber, #50815) [Link] (2 responses)

You can't say you expect people to wander around with their laptops in hope of getting just a bit stronger signal and not be laughed at. You say association shouldn't take more than 20s but in reality it can and it does. You can neither fix all wifi deployments in the world to have perfect coverage nor expect everyone to meet your ideal conditions. I won't repeat what I've already written in the Fedora bug report.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 13:17 UTC (Tue) by dcbw (guest, #50562) [Link]

But there are things you can do driver-side and supplicant side that may significantly fix the problem. And the user can certainly request better coverage from their administrators if they don't control the access points themselves, or get better wifi cards. I'm not opposed to increasing the association timeout.

I'm opposed to blindly increasing it with no specific reason *why*, and no attempt to figure out what the real causes of connection failures are.

Williams: That's when I reach for my revolver...

Posted Mar 25, 2009 11:22 UTC (Wed) by nhippi (subscriber, #34640) [Link]

If the association takes over 20sec, it is likely the rest of wifi usage is going to be painful too, and the user will start wandering around for better reception anyway.

If OTOH only the association is slow, and the rest of networking is reliable, it is a bug somewhere in the stack. And it should be fixed rather than worked around (by adding a configuration option, in this case).

"Make that thing configurable" is often a sign of a "cult of workarounds". The cultists prefer forcing end users to twiddle settings randomly until things work, instead of fixing the underlying bugs.

Williams: That's when I reach for my revolver...

Posted Mar 23, 2009 23:37 UTC (Mon) by endecotp (guest, #36428) [Link] (5 responses)

Very timely for me. My cable connection got flaky at the weekend (277 bytes per second!) and I realised how much I depend on it for important stuff, like earning enough to pay the rent. So I was considering getting a 3G gadget of some sort for emergency use. There are two challenges: Dan's article describes one, and the other, which I think is just as daunting, is deciphering what they cost.

The article is about getting these things to work with network manager. But in my scenario I would want to plug it in to something like my NSLU2 or even my OpenWRT router to provide connectivity for the whole LAN. My NSLU2 isn't going to be running the GUI network manager app: I would need to be able to configure the thing at the "ifconfig" level. I do hope that they aren't putting too much of the support for these things at too high a level.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 0:14 UTC (Tue) by pabs (subscriber, #43278) [Link] (3 responses)

In 0.7 there are command-line clients for NM and the GUI is separate from the network management daemon.

NM isn't the only network management solution in town; there are at least wicd and connman as well.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 0:43 UTC (Tue) by sbdep (subscriber, #13282) [Link] (2 responses)

My experience trying to get 3G data cards to work is that they are usually configured through pppd. So with a named ppp peer and an appropriate init string/chatscript to configure the data card itself, you should be able to integrate the connection into your distribution's normal network management tools and ignore NetworkManager.

Depending on the card, it usually takes a udev rule to prod the card/dongle to switch from USB mass storage to usb modem mode (for USB dongle devices that need this prodding), then a ppp peer config and a chatscript to tell the card to initialize and connect to the network, then "dial" the magic number to get a data connection.
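
For the chatscript step, here is a minimal sketch that renders a typical script for chat(8). It is an illustration, not a configuration known to work with any particular card: the function name `build_chat_script` and the APN `internet.example` are placeholders, and `*99#` is assumed as the "magic number" because it is the common GPRS/3G dial string. Many cards need extra vendor-specific init commands that are not shown.

```python
def build_chat_script(apn: str, dial: str = "*99#") -> str:
    """Render expect/send pairs for chat(8) to bring up a 3G data call.

    The APN is carrier-specific; any value passed here is the caller's
    assumption about their provider.
    """
    if '"' in apn or "'" in apn:
        raise ValueError("APN must not contain quote characters")

    # Define PDP context 1 as an IP context using the given APN.
    cgdcont = 'AT+CGDCONT=1,"IP","%s"' % apn

    lines = [
        "ABORT BUSY",            # give up immediately on these modem replies
        "ABORT 'NO CARRIER'",
        "ABORT ERROR",
        "'' ATZ",                # expect nothing, send a reset
        f"OK '{cgdcont}'",       # expect OK, then set the APN
        f"OK ATD{dial}",         # expect OK, then dial the data number
        "CONNECT ''",            # expect CONNECT, send nothing more
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_chat_script("internet.example"), end="")
```

The output would be saved to a file and referenced from the ppp peer config through pppd's `connect` option, for example as the argument to `chat -f`.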

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 1:47 UTC (Tue) by drag (guest, #31333) [Link] (1 responses)

> My experience trying to get 3G type of data cards to work, is that they are usually configured through pppd.

Well you see that's the problem. That is not really true. It never really was, but it is rapidly getting less and less true.

Like the blog says you can have different interfaces for each card and often you can have multiple serial devices, 3 or more, for configuring a card. The pppd tools are not really capable of configuring them correctly.

---------------------

Take, for example, my Sony 3G phone. I can configure it using pppd and use it like that, but that is essentially 'gimp legacy mode'. The thing sets up a virtual ethernet port over USB and that is what is actually supposed to be used. What is supposed to happen is that the OS sends configuration stuff over one of the serial connections and then you connect through the usb-ethernet adapter.

The reason they are doing this is because PPP has too much overhead and it will choke out over USB before the user can get high speed internet access.

Plus I expect there are all sorts of extra settings that you miss out on just using regular pppd and scripts.

So if you were to benchmark the network performance of Linux vs Windows over cellular data networks you'll find that Linux is usually slower, has more reliability issues, and tends to have higher latency.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 10:14 UTC (Tue) by dcbw (guest, #50562) [Link]

Yeah, 'hso' cards, newer Sierra or Huawei cards, and Ericsson F3507g/MD300/Dell 5530 just wouldn't work with pppd, because they don't use ppp. You'd have to run wvdial with the AT commands to set the APN and the rest of the config, then have wvdial somehow parse the AT_OWANDATA response on 'hso' cards to get the IP address and DNS servers (or run DHCP on Ericsson cards) and then have wvdial set up the IP interface and your /etc/resolv.conf and whatever. Seems like something wvdial wasn't really written for.
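
To make the parsing step concrete, here is a small sketch of pulling the IP address and DNS servers out of an `_OWANDATA` response. The field layout is an assumption (context id, IP address, gateway, DNS 1, DNS 2, then further fields that are ignored), not something stated in this thread, and `parse_owandata` and `WanData` are made-up names. Check against real output from the card before relying on it.

```python
import ipaddress
from typing import List, NamedTuple


class WanData(NamedTuple):
    context_id: int
    address: ipaddress.IPv4Address
    dns: List[ipaddress.IPv4Address]


def parse_owandata(line: str) -> WanData:
    """Parse one '_OWANDATA: ...' line under the assumed field layout."""
    prefix = "_OWANDATA:"
    if not line.startswith(prefix):
        raise ValueError("not an _OWANDATA response")

    fields = [f.strip() for f in line[len(prefix):].split(",")]
    if len(fields) < 5:
        raise ValueError("too few fields in _OWANDATA response")

    # Assumed order: context id, IP address, gateway, DNS 1, DNS 2, ...
    address = ipaddress.IPv4Address(fields[1])
    servers = [ipaddress.IPv4Address(f) for f in fields[3:5]]

    # Treat 0.0.0.0 as "not set" and drop it.
    servers = [s for s in servers if not s.is_unspecified]
    return WanData(int(fields[0]), address, servers)
```

Even with the parsing done, something still has to configure the interface and write the resolver settings, which is the part that falls outside what a dialer was written for.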

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 0:52 UTC (Tue) by camh (guest, #289) [Link]

I am using a Huawei E169 USB 3G modem on my Linux box. I configured it with just pppd and chat. To activate the connection it is a simple matter of starting pppd.

The E169 uses the "option" driver in the kernel. Earlier kernels (maybe about 2.6.26 and earlier) need the usb_modeswitch utility to switch the USB stick from a "CDROM" with drivers to the actual modem. In the later kernels, the "option" driver does this automatically.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 1:04 UTC (Tue) by ras (subscriber, #33059) [Link] (9 responses)

For the most part a wonderful article. It should be mandatory reading for the people who design these bloody modems. It never ceases to amaze me how we engineers can make a simple thing, such as sending a data stream over a wireless connection, so unnecessarily complex.

But he lost me at the end when he said:

> But that’s why NetworkManager rocks; we pony up the cash to make sure our shit works.

He must be living in some parallel universe. I have _never_ had the good fortune to see NetworkManager work as it is supposed to.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 2:03 UTC (Tue) by drag (guest, #31333) [Link] (1 responses)

I think his point is that he and the 'NM crew' are actually going through the effort to get this shit working, which nobody else is really doing. At least nobody working on making it easy for users to auto-configure this stuff.

This 'modem' effort is a relatively recent push on the part of the network-manager folks.

Previously they concentrated on 802.11 wireless and that sort of thing and thus far, as far as my personal experience can tell, that stuff works very well now.

Of course it is not perfect, but it is much better than the nightmare configuring Fedora for multiple networks used to be. (Hell, getting my workmate's Motorola phone working on it in a usable manner took multiple weeks. And even after that he would have to occasionally log into Windows when it would refuse to connect after a while.)

Although I found Debian's configuration stuff worked pretty well for command-line user types. Quite liked it once I figured it out..

I think that Network-manager is more capable, and hacker friendly, than people think it is. Originally it was horrible.. but it's had a couple rewrites as far as I can tell. I figure if somebody doesn't know what Network-Manager dispatcher is for then they are probably missing out on some cool possibilities.

The only thing that has been missing out for me is good command line tools, but they seem to be fixing that.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 10:22 UTC (Tue) by dcbw (guest, #50562) [Link]

I recently had to fix a bug in NM 0.3 in RHEL4. Yes, old versions were pretty bad; but that's always the case with release-early/release-often OSS software. If you don't get it out there and get people using & testing it, it doesn't get better.

Was having that discussion the other day with somebody too... With Mac OS X for example, you might actually be able to create new software that doesn't need to go through a few iterations before it's actually useful to most people, because there isn't as much variation in hardware or OS.

But with Linux, so many people use it in so many configurations, and with so much hardware (and so many drivers of differing quality), that if you don't release stuff early and get people testing it, you'll just go through the same pain whenever you release it later.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 10:09 UTC (Tue) by dcbw (guest, #50562) [Link] (1 responses)

If NM 0.7 hasn't worked for you, and you're using upstream in-kernel device drivers (not -staging crap), then I'd love to hear from you about what your problem is. If you're using NM 0.6, you probably want to upgrade to a newer distro or version that uses NM 0.7.

*Most* of the problems NM has are caused in whole or in part by bad drivers, and that's why it's important to have the code so we can fix those drivers. Binary drivers, on the other hand, cannot be fixed, and no maintainable solution can be had by working around their bugs.

And of course there are bugs in NetworkManager, and features that people want that aren't yet implemented. For example, when the wifi connection fails (timeout, AP forcibly disconnected you, you lose signal, etc) nm-applet will pop up the "Type In Your Key" dialog. People hate that. I understand why. The underlying cause was either (a) the driver failed, or (b) the key was wrong, or (c) the access point is crap. So I'm going to work around that by making NM Just Try Harder.

But there comes a point when Just Try Harder fails, because the driver or hardware have problems, and NM just can't work around it any more. Then we get to fix the driver anyway, which we should have done the first time around.
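
As a rough picture of what "Just Try Harder" could mean, here is a sketch that retries the connection a bounded number of times and only then falls back to asking for the key. This is not NetworkManager's actual logic: the names `connect_with_retries` and `Result` and the default of three attempts are invented for illustration.

```python
from enum import Enum
from typing import Callable


class Result(Enum):
    CONNECTED = "connected"
    ASK_FOR_KEY = "ask_for_key"


def connect_with_retries(attempt: Callable[[], bool], max_tries: int = 3) -> Result:
    """Call attempt() up to max_tries times before giving up.

    attempt() stands in for one association attempt and returns True on
    success. Only after every try has failed is the user prompted.
    """
    for _ in range(max_tries):
        if attempt():
            return Result.CONNECTED
    return Result.ASK_FOR_KEY
```

The bound matters: a wrong key fails every time, so retrying forever would hide case (b), and a driver that never succeeds still has to be fixed.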

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 23:46 UTC (Tue) by ras (subscriber, #33059) [Link]

I avoid NM now, so I can't really give you the sort of information you need. Actually my usual response to finding something is somewhat nicer than that. I find the bug and then send in a bug report + patch. I can only give you my experience with NM.

My first clash with NM was when someone complained to me they were not getting a DHCP lease. At the time I didn't even know what NM was, or that it was running, but soon found out when I used my normal tools to debug the problem (ip, ping, dhclient, etc), and then something was undoing my changes. Things obviously changed, but the configuration files for those tools were untouched. Weird, it was like there was a ghost in the machine. Sniffing around I noticed some of the daemons had connections to dbus, which was a new development. I dumped the dbus messages, and eventually the path led to NM. So it appeared someone had decided to redesign the entire configuration system for networking, moving away from configuration files I could look at in conjunction with man pages and make some sense of. Instead I was confronted with undocumented babble flying across a software bus. You can probably tell how impressed I was with that development.

Still, the solution was simple enough. Disable NM, debug the problem as I would normally, then re-enable NM. The problem was unrelated to NM - it was a bug in the firmware of a wireless router. The only issue NM caused was one of visibility - while NM was running it was impossible to tell at what point things were going wrong.

My second clash with NM was on a machine that has a broken wireless switch. I.e. it had an internal wireless card that was fully functional, but because of the broken switch you could not turn it on. The owner plugged in a second wireless card, which worked perfectly when I configured it manually. Network manager refused to do anything with the second wireless card. I considered finding the bug in NM, but then visions of all those dbus interconnections rose up and I thought better of it. Any event driven daemon with its fingers in that many pies was probably a nightmare inside. So I just blacklisted the first wireless card, and NM was happy.

There were other clashes whose details escape me now, but suffice it to say I have never had a machine which NM "just worked" on. There was always some situation in which it bugged out. Admittedly I tend to work on problems which have defeated others, so the machines tend to have some kink. But not the final one, which occurred last weekend. It was a newly installed Ubuntu 8.10 machine. The owner wanted to use one of those 3G USB dongles described in the article. To NM's credit the dongle just worked. Having proved the dongle worked we disconnected it, and asked NM to reconnect to 802.11. Do you think it would do that? Nope. Again I had no trouble doing it manually, but NM wouldn't. In the end I just told the guy to reboot his machine if he wanted to change from 3G to 802.11.

So now we come to this 3G modem thing. It is great you NM guys are working on it. But what have you produced? I haven't looked, but is it a generic tool we can all use to probe these things, or some code deep within the guts of NM that only NM can use? It is not just an NM problem. Servers and openwrt boxes also have this problem, and they don't run NM.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 13:05 UTC (Tue) by dwmw2 (subscriber, #2063) [Link] (4 responses)

"He must be living in some parallel universe. I have _never_ had the good fortune to see NetworkManager work as it is supposed to."

I was saying that a year or so ago, but I've changed my mind. Since then, I have had it working relatively sanely on a few machines. It is getting better.

That isn't to say there aren't still some problems though — the lack of support for Bluetooth DUN and PAN is painful, although hopefully that should be fixed soon.

Watching the system automatically unmount all my NFS file systems while I was in the middle of a 'yum update' from NFS, just because there was a brief power glitch to the Ethernet switch, wasn't a particularly fun experience either...

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 13:21 UTC (Tue) by dcbw (guest, #50562) [Link] (2 responses)

Just for you David, I got an SE phone with DUN that works on T-Mobile's HSDPA network where I already have an unlimited data plan. So now I have the ability to easily debug and fix issues with Bluetooth.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 13:25 UTC (Tue) by dwmw2 (subscriber, #2063) [Link] (1 responses)

Cool. Does it do PAN too? Although PAN isn't particularly hard to set up on another Linux box...

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 13:37 UTC (Tue) by dcbw (guest, #50562) [Link]

No, it doesn't do PAN, but adamw is shipping me a HTC WinMo phone that does.

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 13:36 UTC (Tue) by dcbw (guest, #50562) [Link]

And I'd like to point out that largely *because* NetworkManager usually doesn't work around stupid drivers and bad infrastructure, but instead encourages developers (including myself) to fix that infrastructure and drivers, we've come quite a long way in driver quality over the past few years.

NetworkManager is both the carrot and the stick. If NM just worked around broken stuff and proprietary drivers, it would be a hacktower of doom and we may still be stuck largely in 2006-wireless land.

Examples of driver/stack fixes due to NetworkManager:

- mac80211 adhoc mode association event and ibss merging timeout fixes
- fix wpa_supplicant adhoc/infra mode switching
- driver scan capability advertisement (automatically handle ap_scan=1/2)
- always respect specific SSID probe scans
- always return scan success/fail information instead of dropping it on the floor leading to userspace timeouts
- age scan results on resume so userspace doesn't get stale AP lists
- driver conformance to WEXT APIs, especially for WPA support around 2006/2007
- D-Bus control interface for wpa_supplicant
- Huge input to next-generation kernel wireless APIs so we don't repeat the mistakes of WEXT (especially with feedback from kernel->userspace, a large weakness of WEXT)
- improved ethernet driver support for carrier detection

Williams: That's when I reach for my revolver...

Posted Mar 24, 2009 18:40 UTC (Tue) by AlexHudson (guest, #41828) [Link]

I have a Huawei 169 3G stick, and a recent update in F10 broke it. It was interesting to read the article and see what NM has to do with this type of hardware; to be honest, I've lost basically all my pppd knowledge and without a GUI I would be pretty much at sea unless I was going to re-learn all that stuff again (which, frankly, I don't want to do).

Dan's update to F10 today fixes the problem for me, and I'm in "just works" land again, thank goodness. Plug in stick, NM kicks it, I'm online.

I would go as far as to say NM is better than any other system I've seen on any platform; it simply rocks. It doesn't always work, it has rough edges, it's not perfect for the static IP crowd yet. But my goodness, it's pretty darn keen.

Same for Nokia E71

Posted Mar 25, 2009 1:59 UTC (Wed) by dag- (guest, #30207) [Link]

I cannot find information anywhere on how to instruct my Nokia E71 to select HSDPA. It works fine both using bluetooth and the USB cable, but whatever AT commands I tried that I found for other devices, I have yet to find the one that works for the Nokia E71 :-(

Very concerning, especially if you need to make it just work without requiring some closed logic from the vendor...


Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds