LWN.net Logo

Ethereal and security

When Coverity released its first set of results from its defect scanning of a number of free software projects, the Ethereal protocol analyzer turned up with one of the lowest defect densities of all. Your editor, when posting the initial results, commented that the low defect density did not seem entirely consistent with the rather high density of security advisories for Ethereal. That comment did not sit well with the Ethereal developers, with one observing that "The article reads as if it was written by an amateur, not a professional with a proper grasp of sentence structure." Oh, well, your editor never claimed to be a "professional."

The original comment was unnecessary, however, and apologies are offered. In an attempt to make amends, your editor decided to take a closer look at Ethereal and its approach to security. What much of the world sees is a long list of security advisories and little else; if there is a larger story, it has not been told outside of the developers' lists. As it turns out, there is, indeed, a larger story.

The list of Ethereal security advisories is indeed long. The six advisories issued in 2005 enumerate 105 different security-related bugs, a number of which are of the form "several dissectors will do something unpleasant in these circumstances." There are 23 different CVE numbers cited. The Ethereal security page gives a number of suggestions for running Ethereal in a more secure way (don't run as root, use something like tcpdump to capture packets, etc.), and notes that "The Ethereal developers agree that the current situation isn't actually satisfying." Your editor, it seems, is not entirely alone in noting that some security issues may exist with Ethereal.

Ethereal has a couple of special challenges. One is that it must deal directly with arbitrary data which may have been specially generated by hostile parties. Any set of bits can come off a network, and Ethereal must do the right thing with it; most applications, instead, receive a cleaner and more controlled input stream from the outside. Ethereal also must deal with a wide variety of packet types, which leads to the inclusion of a large library of protocol-specific "dissectors." These dissectors bear some resemblance to device drivers in an operating system kernel: they are specialized, written by a diverse group of authors, and can be hard for others to review and test. And, as with drivers in the kernel, dissectors are the source of a large percentage of Ethereal bugs.

Ethereal vulnerabilities can also be serious. While problems in packages like cube, zoo, or tetex are very much worth fixing, the chances of systems being compromised by those vulnerabilities are relatively small. Ethereal, however, is a tool used by system and network administrators. Known vulnerabilities in Ethereal can be used to compromise an administrator's system; all that is required is the injection of a suitably-crafted packet onto a network where Ethereal is running. So Ethereal vulnerabilities could be especially attractive to an attacker with a specific target. This fact can be driven home by doing a quick search for Ethereal exploits; a number have been posted over the years.

So the Ethereal developers clearly need to keep security in mind. The good news is that they seem to be doing exactly that. While some of the vulnerabilities disclosed in 2005 were found by outside parties, the vast majority of them were turned up by the Ethereal hackers themselves. The developers, it seems, are putting some significant effort into finding problems before hostile outsiders do. This activity nicely explains both the large number of advisories and the small number of defects in the current Ethereal code base.

Clearly, the right kind of work is being done. Here (from the Ethereal security development page) are some of the things the Ethereal developers are doing to improve the security of their project:

  • Fuzz testing. As has been discovered in many applications, the feeding of random data to a program can turn up all kinds of interesting behavior. Ethereal has a "randpkt" utility which feeds entirely random data to the system. There is also an "editcap" program which introduces random corruption into files containing streams of real packets. Any dissector which is not truly paranoid about the data contained in the packets presented to it will eventually be caught out by a fuzzed packet.

  • Automatic code generation. Rather than hand-crafting code to deal with the structure of every packet type, the project is looking at generating dissector code from a description of the packet format. Once the code generator has been verified as safe, the resulting dissectors should be much more solid. Code generation is being used in a number of projects (Samba 4, for example) to produce better code in less time; Ethereal is machine generating some of its dissectors now, with an eye toward generating most or all of them at some point in the future.

  • Various changes aimed at avoiding dangerous code. These include core API changes to make certain kinds of errors harder to create. The tvbuff abstraction, for example, allows a portion of a packet to be passed to a dissector and catches any attempts to access data outside of that area. The Ethereal developers are also making a (somewhat belated) effort to stop using dangerous C library functions like sprintf() and strcat().

Throw in techniques like privilege separation and good, old-fashioned code review, and the result should be a relatively secure package. Perfect security is hard to come by, and Ethereal users should still stay on top of their updates. But the Ethereal developers appear to have a handle on the problem and are trying to do the right things. If all free software projects took security as seriously, our systems would be rather more solid.


(Log in to post comments)

Ethereal and security

Posted Mar 16, 2006 2:29 UTC (Thu) by bk (guest, #25617) [Link]

It seems to me like priviledge separation should be the absolute first thing to implement. Why are they doing all this fancy automated code generation stuff when you can massively decrease the incidence and severity of vulnerabilities by not running every line of code as root? This seems almost too obvious to bother saying, yet it hasn't been done yet.

Ethereal and security

Posted Mar 16, 2006 4:54 UTC (Thu) by bradfitz (subscriber, #4378) [Link]

While I agree with you, keep in mind that privsep only helps if the users are keeping their kernel updated. A lot of users keep userspace more up-to-date than their kernels. With an out-of-date kernel, there are generally dozens of privilege escalation holes, so any security hole effectively means owning the entire machine.

Ethereal and security

Posted Mar 16, 2006 10:08 UTC (Thu) by nix (subscriber, #2304) [Link]

On top of that, unless you're running something like AppArmor, a security hole in Ethereal still means the attacker 0wns the account you're running Ethereal as. So privsep doesn't make security go away, by any means.

Ethereal and privilege separation

Posted Mar 18, 2006 0:39 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

Privilege separation encompasses separating the program from those privileges as well -- the ethereal process doesn't have to be owned by the same uid that owns the shell from which it was launched.

Ethereal and security

Posted Mar 17, 2006 3:31 UTC (Fri) by ncm (subscriber, #165) [Link]

No, privilege separation really can help, if a small amount of risky but well-audited code runs as root, but doesn't try to interpret formats; and
a second smallish and well-audited part, not necessarily running as root, just validates input and then passes along its deconstructed analysis of the packets to the UI part of the program. Then, the huge and not-so-well audited UI part would never encounter badly-constructed input.

That's not to say that Ethereal is doing that.

Ethereal and security

Posted Mar 16, 2006 2:53 UTC (Thu) by cventers (subscriber, #31465) [Link]

> Ethereal has a couple of special challenges. One is that it must deal
> directly with arbitrary data which may have been specially generated by
> hostile parties.

Ethereal does indeed have challenges, but I don't know if you could really
call them 'special'. When it comes to programming, I'm of the DJB/qmail
school of thought - don't trust a thing. Ideally, there should be no way
for external input not consistent with your expectations (be it keystrokes
or packets) to crash your program.

Ethereal and security

Posted Mar 16, 2006 19:07 UTC (Thu) by cthulhu (guest, #4776) [Link]

I'm totally in agreement with you on this. I gave it some thought, though, and realized that for Ethereal, it's hard because there is not one correct format for a packet. If we were dealing with a file format, there might be some relatively small number of variations in the format and once you validated the input, you could go and trust the data.

But for packets, essentially every packet dissector needs to do this at every step of the way, because almost every stream of bytes can be valid. It's precisely the dissection that needs to do the validation, since that's where the protocol smarts are. Seems like writing dissectors in some sort of higher level language is a good way to make sure the validation is built in.

Ethereal and security

Posted Mar 16, 2006 16:12 UTC (Thu) by swiftone (guest, #17420) [Link]

I know nothing of Ethereal security practices, however I worry that you refer to the original comment about the number of security advisories vs the low number of defects/line calculated as "unnecessary".

The comment was a welcome reality-check on the Coverity study. Following up with an examination of why that disparity exists was the only part missing.

There is plenty of white-washing and unquestioning lack of detail in the press already. We need more probing questions (and answers), not less.

Ethereal and security

Posted Mar 16, 2006 16:31 UTC (Thu) by RobSeace (subscriber, #4435) [Link]

> The Ethereal developers are also making a (somewhat belated) effort to stop
> using dangerous C library functions like sprintf() and strcat().

They're only dangerous if misused... You can also misuse their "safe"
replacements, rendering them equally "dangerous"... (Eg: passing an
incorrect buffer size to snprintf()/strl*(), failing to handle the case
where strn*() doesn't null-terminate, etc...) I really wish people would
stop promoting this idea of "safe" vs. "dangerous" lib functions... It
makes people think all they need to do to be safe is switch from one set
of functions to another, and all their problems will be solved... But,
they won't be if you're still writing bad code... It doesn't matter WHAT
lib functions you're using: if you don't know how to use them correctly,
you're likely to write insecure/buggy code... Ie: the problem lies with
the programmers, not their tools... As they say, it's a poor craftsman who
blames his tools for his failures...

Now, that said, you CAN legitimately make the case that some functions are
a lot easier to misuse than others... So, I'm certainly not opposed to
steering people towards the harder to misuse ones... But, suggesting (or,
sometimes flat-out saying) that ALL uses of function X amount to a bug is
highly unfair... (There are plenty of legit uses of sprintf()/strcpy()/etc...
As long as you have full control over all input, and are fully aware of the
buffer sizes involved, there's not really a problem... But, yes, there are
also plenty of cases where you should NOT use those functions, as well...)

There are some exceptions to the above though... A few lib functions are
indeed just horrible ideas which should never have been created in the first
place, and ANY use of them is almost guaranteed to be a bug... Eg: gets()...
It's pretty much completely impossible to use that in any way that does
not present a vulnerability, unless you are the only user of the app, and
can guarantee no one else will ever have an opportunity to enter input for
it... Which isn't a very useful real-world scenario... But, anyway, such
functions are pretty rare, really... And, things like sprintf()/strcpy()
aren't really in the same class... Yes, they are easy to misuse, and poor
programmers often DO misuse them quite a lot, so it's right to perhaps be
a bit suspicious of their use... But, there are also lots of ways to use
them perfectly safely, so one shouldn't just assume their use always
inherently represents an insecurity, nor should you assume their replacement
by so-called "safe" alternatives necessarily means the code is any more
secure than it was before...

Ethereal and security

Posted Mar 16, 2006 17:08 UTC (Thu) by kamil (subscriber, #3802) [Link]

I completely agree. It always irritates me when I encounter remarks from "experts" of the form: "That code is insecure. I've grepped through the source and found 50 uses of sprintf".

Ethereal and security

Posted Mar 16, 2006 19:10 UTC (Thu) by Ross (subscriber, #4065) [Link]

Amen. I've found misuse of the strn*() versions (usually missing termination) about as often as the standard versions. Using length-limited string operations is not the easy fix people seem to think.

Though, having said that, I wonder why the strl*() functions can't be standarized. They are arguably the easist to use because they match people's unconscious expectations.

Ethereal and security

Posted Mar 17, 2006 14:44 UTC (Fri) by bronson (subscriber, #4806) [Link]

The reason why strcat et al are considered unsafe is because of maintenance. Of course, the following code, as written, is safe:
if(strlen(s) >= bufsiz-1) {
return FAIL;
}
strcat(buf, s);
Then, 6 months later, someone applies a bug fix that inserts a few lines between the strlen and strcat. Some more code churn. A few months later a hurried programmer changes the function so it now copes with the error instead of returning FAIL and misses the connection to the strcat 10 lines away. And now you have a cleverly disguised buffer overflow bug. Maintenance is a random, scary process. Occasionally mistakes like this will happen.

So, as long as your source code never changes, then strcat and sprintf are safe. Otherwise, let's just hope you forever maintain all the code that you wrote. The strn calls don't fix all buffer problems, but they make the easy ones like this a lot less likely.

As an aside rant, I'd like to maul the idiot who decided that strncpy would NOT null-terminate a truncated string. wtf??? Personally, I blindly null-terminate after each call:

strncpy(b, a, bufsiz);
b[bufsiz-1] = '\0';
But I have to go to the manual each time to remind myself which strn calls terminate and which are sloppy. Oh well. Thank goodness for scripting languages. :)

Ethereal and security

Posted Mar 17, 2006 17:00 UTC (Fri) by RobSeace (subscriber, #4435) [Link]

> Of course, the following code, as written, is safe:

Actually, I think you meant to check "strlen (buf) + strlen (s)", not just
the latter by itself... ;-) (Unless, of course, "bufsiz" is already adjusted
to take into account the amount used up by "buf" already...)

> Maintenance is a random, scary process. Occasionally mistakes like this
> will happen.

True... But, the same can happen with lots of other things, as well...
Basically, what you're railing against here is code being modified by
those who aren't intimately familiar with it, which indeed can often cause
bugs to be introduced... But, I don't really see anything too specially
dangerous in regards to sprintf()/strcpy()/etc... The safety check that
someone else mistakenly trashes could instead be that an int value doesn't
exceed some threshold, or that an input field doesn't contain some bad
value, or any number of other things... Basically, modifying code when
you don't fully understand it is almost always a really bad idea, and is
likely to come back and bite you sooner or later... Regardless of WHAT
lib functions are being used by the code in question... I don't really see
any particular functions raising or lowering the threat-level there...

In fact, I can imagine a case where someone just blindly replacing a test
and strcpy()/strcat() like that with a strncpy()/strlcpy() might completely
screw the program up, causing more problems... Maybe there's a very good
reason it wants to do that length check, and actively forbid longer values...
And, most people when they switch to one of the "safe" length-limited
functions just let them silently truncate (otherwise, the code is basically
equally complex as the length-test + "unsafe" function it's replacing,
which makes it kinda pointless)... So, perhaps truncating long data rather
than hard-failing it will introduce some other security problem somewhere
down the line... *shrug*

> As an aside rant, I'd like to maul the idiot who decided that strncpy
> would NOT null-terminate a truncated string. wtf???

Yep, I agree... That was a very stupid design... But, that's looking at
it these days, from the perspective of wanting to use it as a drop-in "safe"
replacement for strcpy(), and I'm not sure it was ever actually designed
with that use in mind... Because, look at its other very bizarre behavior:
if there's extra room in the buffer, then not only does it null-terminate,
but it completely FILLS the rest of the buffer with null chars! Now, that's
a completely unnecessary and inefficient behavior, if it were intended to
just be a "safe" strcpy()... Basically, it seems more like it was intended
to be used for copying data into structs or raw database records or some
other such things, where you have fixed-sized string fields which are not
necessarily null-terminated, but where you want any extra unused space in
the fields to be filled with nulls (for compressability, and/or database
sorting behavior, etc.)... In fact, this exactly describes the situation
at the company I work for, where we use raw C-Tree databases for various
things... We have our own set of 2 functions: one for copying data into a
database field from a normal null-terminated string, and one for copying
data from a database field into a normal null-terminated string... The
former is implemented as just a strncpy() call, since that exactly fits our
needed usage, almost like it was designed for it... However, the latter is
much trickier; we could do it as just strncpy() plus our own nulling, as
you show, but that's really inefficient when the target buffer is larger
than the source data being copied... So, basically, we take the min of the
buffer length (- 1) and the strlen() of the source, and use that as a length
to pass to memcpy(), and then manually null out the destination... Ie: it
basically behaves like strlcpy() does (minus its return value behavior),
which is a much more appropriate behavior for a "safe" strcpy() replacement
than strncpy()'s behavior...

Ethereal and security

Posted Mar 26, 2006 15:59 UTC (Sun) by bronson (subscriber, #4806) [Link]

Your comment on maintenance is a good one. My point was just that by combining the check and copy into a single operation, it becomes harder to misunderstand the intent or break the code. As you say, though, even this provides enough rope to hang yourself if you're not careful.

I'm glad you mentioned that about strncpy. That makes all kinds of sense. Understanding *why* something is the way it is tends to make it a lot easier to live with. If strncpy truly wasn't indented to be a safe replacement for strcpy, I hereby rescind my threat of mauling. :) (shame to give it such a similar name though)

Ethereal and security

Posted Mar 16, 2006 18:58 UTC (Thu) by ekj (subscriber, #1524) [Link]

I don't think dealing with random, possibly even malicious data is anything at all special for Ethereal.

On the contrary, it's a rare program that does *not* have the requirement that it stay safe even when asked to work on malicious data.

This is true for any web-browser, any program that plays any kind of music. Any program that reads any sort of document. Infact any program whatsoever that accepts any kind of input whatsoever.

When I use Gimp, Imagemagic, gqview or Kphotoalbum to deal with images I got from a non-trusted source it's a requirement that they don't blow up, even when faced with a maliciously crafted image. (if it's outside spec, it's perfectly fine if they say: "Invalid jpeg", but they should neither crash, nor expose the account used to run them.

Are protocols *really* that much more diverse and/or complex than say the list of image-formats (and variants) recognized by Gimp and/or the list of document-formats recognized by OpenOffice ?

Ethereal and security

Posted Mar 21, 2006 15:13 UTC (Tue) by mogul (subscriber, #3163) [Link]

The Ethereal protocol display filter list page has close to 41,000 entries: http://www.ethereal.com/docs/dfref

So the answer to your question is a resounding yes.

no grasp of sentence structure

Posted Mar 17, 2006 18:34 UTC (Fri) by vblum (guest, #1151) [Link]

That quoted text snippet is one of the most uninformed comments I have ever seen. I would not be reading lwn if it were not for Jon Corbets extremely clear yet entertaining style of writing. If that style is not professional, I have zero interest in reading any professional material.

no grasp of sentence structure

Posted Mar 18, 2006 0:49 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

The post doesn't give a clue as to what about a grasp of sentence structure is supposedly inconsistent with the LWN article, so that's a mystery.

But it does suggest an explanation for what LWN reported as a paradox: that there were a high density of security holes reported vs a low density of Coverity bugs. That explanation ought to have been in this followup article. It goes like this: Ethereal developers were whipped into action by the high density of security holes reported, so that by the time the Coverity study ran, the bugs were all gone.

explanation of advisories vs bugs

Posted Mar 20, 2006 1:34 UTC (Mon) by man_ls (subscriber, #15091) [Link]

The explanation is in the article, more or less clearly: "The developers, it seems, are putting some significant effort into finding problems before hostile outsiders do", "But the Ethereal developers appear to have a handle on the problem and are trying to do the right things." Perhaps it should be spelt out more clearly.

Code generation == compilation; plugins and security

Posted Mar 22, 2006 9:54 UTC (Wed) by wingo (subscriber, #26929) [Link]

"Automatic code generation" is a fancy phrase for compilation. It entails writing protocol dissectors in a higher-level language, one that is not C. A poignant demonstration that C is an inappropriate language for dealing with untrusted data streams.

It can be done, of course, but the code needs to be written by someone with an understanding of its pitfalls. In a plugin-based data stream processor like ethereal, this will never be the case: contributors unfamiliar with the code base will come, add support for their protocol, then leave. Core developers don't want to reject these contributions -- "only people that want to use this will be using it" -- but every dissector adds a new attack vector.

The same pattern exists in media players/frameworks: xine, mplayer, gstreamer, ... (This theory doesn't however explain the plethora of bugs in image parsers -- PNG is not exactly a rare format.)

Code generation == compilation; plugins and security

Posted Mar 23, 2006 18:57 UTC (Thu) by flewellyn (subscriber, #5047) [Link]

No, I think what they mean by "code generation" is something more akin to Lisp macros, but since it's in C, they have to use another description language, and translate that into C code.

Those who do not learn from Lisp are doomed to reimplement it poorly.

Code generation == compilation; plugins and security

Posted Mar 24, 2006 9:24 UTC (Fri) by wingo (subscriber, #26929) [Link]

To translate from a higher level language to a lower one is to compile. This is true whether your lower level language is extended via macros to give you a higher level language, or whether it is compiled by a separate program. For example, Chicken is a scheme to C compiler, not "code generator".

Note that macro evaluation implies compilation for Common Lisp at least, because it happens at read-time.

Code generation == compilation; plugins and security

Posted Mar 27, 2006 21:03 UTC (Mon) by devinjones (subscriber, #11272) [Link]

Perhaps they are using "code generator" instead of compiler because they are translating from a functional specification to a procedural language.

Code generation == compilation; plugins and security

Posted Mar 30, 2006 7:51 UTC (Thu) by jmayer (subscriber, #595) [Link]

We currently have code generators for (parts of) the following
description/specification formats:
- ASN.1 (many ITU-Standards, some IETF ones, some others)
- Idl (MS-RPC)
- Many higher level netware protocols
- X11 packets
The first two being the "more important" ones as they generate dissectors
from "standardized" description formats (although we need to supply some
code manually to "aid" the generated code).

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds