
CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Linux.com looks at command line tools for troubleshooting your system. "When something goes wrong with your Linux-based system, you can try to diagnose it yourself with the many troubleshooting tools bundled with the operating system. Knowing about these tools, and how to effectively use them, can help you overcome many of the common problems on your system. Here's a list of some of the weapons in your arsenal against Linux problems."

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 19, 2007 18:38 UTC (Mon) by aisotton (subscriber, #39278)

Nice article, but I really hoped for more. I have to admit that I didn't know about ltrace though.

A few tools which are missing from the list:

  • netstat - find out what your system's network stack is doing.
  • nc - check whether services are running, transfer files, run commands and soooo much more.
  • du - see how big files/directories are.
  • vmstat - information about memory. You can get the important parts of the information from top too, but when a system is heavily overloaded vmstat just runs faster.
  • ps, kill, killall - classics which need no description.
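
As a small illustration of the du entry in the list above, here is a sketch of a helper that shows the largest entries directly under a directory. The function name largest_entries and its two arguments are made up for this example; only du, sort and head are real tools.

```shell
# Print the N largest entries directly under a directory, biggest first.
# largest_entries is a hypothetical helper name, not a standard command.
# Sizes are in kilobytes (du -k) so that sort -n can compare them.
largest_entries() {
    dir=${1:-.}
    count=${2:-5}
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Usage (not run here): largest_entries /var/log 5
```

du reports disk usage rather than apparent file length, so the numbers can differ from what ls -l shows for sparse or compressed files.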

Does anybody else know about more useful tools for quick troubleshooting?

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 19, 2007 19:39 UTC (Mon) by Ross (subscriber, #4065)

For memory-leak problems you might also use:
ipcs # shared memory or other IPC leaks
xrestop # X resource leaks
memstat # memory usage by process and file

Lots of "development" tools can also be useful for finding crashes or
memory leaks:
gdb
valgrind

glibc also has some environmental variables to help find (or work around) buggy usage of malloc/free:
MALLOC_CHECK_=0 # heap corruption ignored
MALLOC_CHECK_=1 # heap corruption printed to stderr
MALLOC_CHECK_=2 # heap corruption causes abort()
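
The variable only needs to be set for the process being investigated, not for the whole session. A minimal sketch, assuming a helper called with_malloc_check (a name invented here) that runs one command with the chosen level:

```shell
# Run a single command with glibc's MALLOC_CHECK_ set for that command only.
# with_malloc_check is a hypothetical helper name, not a standard tool.
with_malloc_check() {
    level=$1
    shift
    env "MALLOC_CHECK_=$level" "$@"
}

# Show that the variable reaches the child process; replace the sh -c
# command with the program you suspect of heap corruption.
with_malloc_check 2 sh -c 'echo "MALLOC_CHECK_ is $MALLOC_CHECK_"'
```

Whether the setting has any effect depends on the glibc in use. On newer releases (2.34 and later, as far as I know) the checking code lives in a separate libc_malloc_debug.so that has to be preloaded.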

Also, just doing ls -l in /proc/<pid>/fd can be enlightening when trying
to figure out what a program is doing.
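
A quick way to try this is to point it at the current shell. The sketch below wraps the ls -l call in a function named list_fds (a name chosen for this example) and assumes a Linux system with /proc mounted:

```shell
# List the open file descriptors of a process and what each one points to.
# list_fds is a hypothetical helper name; it needs Linux /proc and
# permission to inspect the target process.
list_fds() {
    pid=$1
    ls -l "/proc/$pid/fd"
}

# Inspect the shell running this script.
list_fds "$$"
```

Each entry is a symlink from the descriptor number to a file, pipe, socket or device, so a descriptor count that keeps growing is a hint of a leak.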

-Ross

netcat

Posted Feb 20, 2007 4:02 UTC (Tue) by ldo (subscriber, #40946)

Just a note that nc might be called netcat, depending on your distro. On my Gentoo system it's nc, on SuSE it's netcat.
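
A script that has to run on several distros can probe for both names instead of hard-coding one. A minimal sketch, where first_available is a helper invented for this example:

```shell
# Print the first command name from the arguments that exists on PATH.
# first_available is a hypothetical helper name used for illustration.
first_available() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    return 1
}

# Pick whichever netcat binary this distribution ships, if any.
if NC=$(first_available nc netcat); then
    echo "netcat is available as: $NC"
else
    echo "no netcat binary found on PATH"
fi
```

Different netcat implementations also disagree on some options, so finding the binary does not guarantee that a given command line will work.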

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 21, 2007 13:35 UTC (Wed) by k8to (subscriber, #15413)

Unfortunately ltrace doesn't have nearly as much smarts as strace in decoding arguments. In fairness the problem space is HUGE by comparison, so I'm not sure you can realistically expect it, but I find I often can't really make heads or tails of an ltrace, while an strace is often quite straightforward.

The issue being all the interfaces that take pointers. strace typically shows you the struct entries, ltrace typically shows you the location and that's it.

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 19, 2007 21:08 UTC (Mon) by foo (guest, #1117)

I recently had my Firefox session lock up while I had a fairly
long email in a text area, which I didn't want to lose.
I did a "gdb /path/to/firefox-bin [FIREFOX-PID]", forced it
to return from the routine it was stuck in, exited the debugger,
and everything was back to normal.

Sometimes it's good being a nerd. :)

Impressive!

Posted Feb 20, 2007 1:47 UTC (Tue) by pr1268 (subscriber, #24648)

Thanks, foo, for the suggestion! I've never tried what you just mentioned, but it seems like a nice way to force errant apps to behave. (Not that this happens that often to me running Linux.)

I've been hacking Linux for about 9 years, exclusively for almost 3, and yet I'm always learning something new about it.

Oh, and one more slightly off-topic idea: Try doing foo's method to $HUNG_PROCESS in Windows. :-D

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 2:30 UTC (Tue) by Ross (subscriber, #4065)

How did that work when there were file descriptors open (including the connection to the X server)? It seems like you would have to reopen them manually and restore any state (like authentication with the server or refilling any open pipes).

If you always start Firefox under gdb then it's a different story :)

-Ross

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 6:23 UTC (Tue) by donio (subscriber, #94)

He attached gdb to the running process by specifying the pid as the
second argument to gdb.

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 14:46 UTC (Tue) by foo (guest, #1117)

Exactly. If you give gdb a second argument that's not
a core file, he'll interpret it as a PID and attach
to it. When you exit gdb, he detaches and everything's
copacetic.

If you just want to see where a process is spending time,
"strace -p PID" attaches to the running process and
detaches properly when you hit Ctrl-C.

And to be fair to the comment above, I'm sure you can do
the same thing under Windows, if you have a debugger
installed.
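
For a read-only variant of the attach trick, gdb can be run non-interactively so that it prints a backtrace and detaches by itself. This is a sketch under the assumption that gdb is installed and that you are allowed to ptrace the target; stack_snapshot is a name invented for this example. It only inspects the process and does not force a return from the stuck routine as described above.

```shell
# Print a backtrace of every thread in a running process, then detach.
# stack_snapshot is a hypothetical helper name; it needs gdb installed and
# permission to ptrace the target (same user, or root).
stack_snapshot() {
    pid=$1
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found on PATH" >&2
        return 127
    fi
    # -p attaches to the PID; -batch makes gdb exit, and so detach,
    # once the -ex commands have run.
    gdb -p "$pid" -batch -ex 'thread apply all bt'
}

# Usage (not run here): stack_snapshot 12345
```

Without debug symbols the backtrace shows mostly raw addresses, but the library names alone often reveal where the process is stuck.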

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 16:24 UTC (Tue) by Ross (subscriber, #4065)

Ah, I misread "lock up" as "crashed". I often have the latter problem, especially when trying to save a link in Firefox, but haven't seen too many lockups.

-Ross

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 22:45 UTC (Tue) by Los__D (guest, #15263)

I have a feeling that you are not a Flash user :)

At least under my Ubuntu installation, the Flash player has a habit of locking Firefox quite hard, especially when sound is involved.

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 21:37 UTC (Tue) by robert_s (subscriber, #42402)

You are so lucky that worked.

hex dump tools

Posted Feb 20, 2007 4:01 UTC (Tue) by ldo (subscriber, #40946)

Just checking a client's SuSE 10.2 system, and I find no less than three tools for doing file dumps: the aforementioned hexdump (part of the util-linux package), also od (part of coreutils), and xxd (comes with vim, of all things!)
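
To compare the three, the same few bytes can be fed through each tool. The sketch below always runs od and only runs hexdump and xxd when they are installed; sample_bytes is a helper name made up for this example.

```shell
# Emit four known bytes: 'L', 'W', 'N' and a newline.
# sample_bytes is a hypothetical helper name used for illustration.
sample_bytes() {
    printf 'LWN\n'
}

# od is part of coreutils: hex offsets (-A x), one hex byte per column (-t x1).
sample_bytes | od -A x -t x1

# hexdump and xxd are optional, so check for them first.
if command -v hexdump >/dev/null 2>&1; then
    sample_bytes | hexdump -C
fi
if command -v xxd >/dev/null 2>&1; then
    sample_bytes | xxd
fi
```

All three should show the bytes 4c 57 4e 0a; they differ mainly in layout and in whether an ASCII column is printed alongside.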

hex dump tools

Posted Feb 20, 2007 6:27 UTC (Tue) by donio (subscriber, #94)

Emacs also comes with a hex dumper called hexl, tucked away under libexec.

First Rule of Diagnostics

Posted Feb 20, 2007 4:08 UTC (Tue) by ldo (subscriber, #40946)

By the way, you are strongly encouraged to try out diagnostic tools on a correctly-running system or programs. Getting a feel for what correct output looks like can be very helpful when things actually go wrong.

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 20, 2007 21:46 UTC (Tue) by robert_s (subscriber, #42402)

I'll just add tcpdump and the tools that come with dsniff to that list. Great for sorting out bizarre network problems.

CLI Magic: Linux troubleshooting tools 101 (Linux.com)

Posted Feb 21, 2007 6:22 UTC (Wed) by speedster1 (subscriber, #8143)

Not to mention wireshark (formerly ethereal)! When I've got a tricky networking problem, I usually use tcpdump to do captures (on both sides of the connection when possible) then analyze them on my laptop using wireshark.

Use tcpdump to store full packets to file instead of displaying just the headers:

tcpdump -s0 -w <file> -i [ethN | any] [optional filter expression]

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds