Dan Kaminsky Discovers Fundamental Issue In DNS: Massive Multivendor Patch Released (Securosis.com) [LWN.net]

Recursive servers, but not proxy servers, affected.

Posted Jul 8, 2008 21:51 UTC (Tue) by endecotp (guest, #36428) [Link] (14 responses)

According to the Debian advisory, bind8 is vulnerable but unfixable, bind9 is vulnerable and
fixed, and PowerDNS, MaraDNS and Unbound are not vulnerable.  The "GNU libc stub resolver",
whatever that is, is also unfixable.

Missing from this list is dnsmasq, which is running on my WRT54G.  My understanding is that
because this is not a _recursive_ DNS server, it doesn't have a problem.  Is that right?  In
that case the suggestion that "all name servers should be patched" is a bit OTT, right?

Recursive servers, but not proxy servers, affected.

Posted Jul 8, 2008 22:40 UTC (Tue) by rfunk (subscriber, #4054) [Link] (1 responses)

As I read it, it's a vulnerability in anything that acts as DNS client software. The fact that the glibc resolver is considered vulnerable indicates to me that it's not just recursive DNS that's a problem.

It appears that BIND 9 was fixed for this problem by implementing UDP source port randomization for the queries.

I've been looking into dnsmasq. It appears to rely on the underlying OS to choose the client port for queries. (Its random-number routine is used to generate a query ID, not a port number.) So I suppose the question then becomes whether Linux randomizes UDP client ports sufficiently and properly to address this issue. An IETF draft says Linux does source-port randomization, but it'd be nice to find a more specific authoritative source.
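
One rough way to look at the question above is to ask the kernel for a handful of UDP ports and see what it hands out. This is only an observational sketch: the function name `sample_ephemeral_ports` is made up for illustration, it binds to the loopback address, and a small sample cannot prove that the allocation is unpredictable to an attacker.

```python
import socket


def sample_ephemeral_ports(count=8, host="127.0.0.1"):
    """Return the source ports the kernel assigns to fresh UDP sockets."""
    ports = []
    for _ in range(count):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # Port 0 asks the kernel to choose the port, which is what
            # happens implicitly for a client that never binds explicitly.
            sock.bind((host, 0))
            ports.append(sock.getsockname()[1])
    return ports


if __name__ == "__main__":
    ports = sample_ephemeral_ports()
    print("assigned ports:", ports)
    print("distinct ports:", len(set(ports)))
```

If the printed ports climb by one each time, the kernel is allocating sequentially; if they jump around, it is doing some form of randomization. Either way this says nothing about a daemon that opens one socket at startup and reuses it.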

Recursive servers, but not proxy servers, affected.

Posted Jul 9, 2008 10:59 UTC (Wed) by rberger (guest, #52829) [Link]

Apparently, as far as a fixed query source port goes, dnsmasq seems to be vulnerable anyway. I
ran some tests on my router - running kernel 2.6.25.10 - and the source port dnsmasq used to
forward requests to my ISP's name servers was the same for several queries.

As I understand it, you can specify a source port via config or command line, or dnsmasq will
pick one randomly at startup. But once it is chosen, it apparently will use it for all queries
from this point on.

Since it doesn't recurse, dnsmasq won't be top priority I guess, as an attacker would have to
spoof one of the ISP's nameservers, which is much more unlikely than spoofing one of the
servers on a recursive resolution path. So I'd be interested in my ISP getting his servers
straight in the first place.

But it would still be nice if this got fixed some time, given the attention this issue draws.

Recursive servers, but not proxy servers, affected.

Posted Jul 8, 2008 22:45 UTC (Tue) by jebba (guest, #4439) [Link] (2 responses)

It doesn't say anything about djbdns either, except to link to some of his papers written
years back. Some commenter said it wasn't vuln due to port randomization.

Recursive servers, but not proxy servers, affected.

Posted Jul 8, 2008 23:34 UTC (Tue) by rfunk (subscriber, #4054) [Link] (1 responses)

Dan Kaminsky has confirmed that djbdns is not vulnerable due to its source 
port randomization.

Recursive servers, but not proxy servers, affected.

Posted Jul 9, 2008 23:07 UTC (Wed) by jebba (guest, #4439) [Link]

As a side note, djbdns (and related programs) are now all in the public domain (finally!).
I hadn't heard that good news until today. :)

http://cr.yp.to/distributors.html

"What are the distribution terms for djbdns?
2007.12.28: I hereby place the djbdns package (in particular, djbdns-1.05.tar.gz, with MD5
checksum 3147c5cd56832aa3b41955c7a51cbeb2) into the public domain. The package is no longer
copyrighted."

Recursive servers, but not proxy servers, affected.

Posted Jul 8, 2008 23:10 UTC (Tue) by nix (subscriber, #2304) [Link] (7 responses)

The GNU libc stub resolver is the resolver in libc which 
reads /etc/resolv.conf and satisfies requests from almost all actual 
programs for names (the commonest class of programs that don't use the 
libc resolver are those that need asynchronous name resolution). It's 
derived from BIND8 code (IIRC, it may even be late BIND4).

This pretty much means that all Linux systems are vulnerable to cache 
pollution unless they are going via a non-vulnerable caching nameserver 
under control of someone trusted. However, since it's only a stub resolver 
and doesn't do recursive lookups, everyone has to use some caching 
nameserver anyway, simply for performance reasons... so this isn't so 
significant a problem. (adns, fairly widely used as a stub resolver when 
asynchronous name resolution is required, has the same flaw, if flaw it 
is.)

Recursive servers, but not proxy servers, affected.

Posted Jul 9, 2008 12:44 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

It's BIND 8-derived, and as far as I can tell it does source port randomization (at least the
source ports it uses on my system, with a pristine glibc 2.7, are randomized, assuming a
sufficiently recent kernel).

Recursive servers, but not proxy servers, affected.

Posted Jul 10, 2008 12:19 UTC (Thu) by BenHutchings (subscriber, #37955) [Link] (1 responses)

I suspect that the glibc stub resolver just binds its socket to an unspecified port, which is
fairly random after the system has been running for a while (whereas BIND typically starts
shortly after the machine is booted). But an attacker can find out which source port you're
using if you ever send a query to a DNS server they control. If I understand correctly, the
source port needs to be randomised for each query (i.e. the resolver keeps re-binding to
specified random ports).
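
A minimal sketch of the per-query re-binding described above, as opposed to binding once and keeping the socket. The names `open_query_socket` and `query_ports` and the port range are assumptions made for illustration; this is not how glibc or BIND actually implement it, and no DNS packet is sent.

```python
import secrets
import socket

# Assumed usable range; real resolvers may exclude more ports than this.
PORT_MIN, PORT_MAX = 1024, 65535


def open_query_socket(host="0.0.0.0", max_tries=50):
    """Return a new UDP socket bound to a freshly drawn random source port."""
    for _ in range(max_tries):
        port = PORT_MIN + secrets.randbelow(PORT_MAX - PORT_MIN + 1)
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            sock.bind((host, port))
            return sock
        except OSError:
            # Port already in use or not permitted: close and draw again.
            sock.close()
    raise RuntimeError("no free source port found")


def query_ports(n=5, host="127.0.0.1"):
    """Open and close one socket per simulated query, recording each port."""
    ports = []
    for _ in range(n):
        sock = open_query_socket(host)
        ports.append(sock.getsockname()[1])
        sock.close()  # never reused, so the next query gets a new port
    return ports


if __name__ == "__main__":
    print(query_ports())
```

The point of closing the socket after each query is that learning the port used for one query (for example by running a name server the victim talks to) tells the attacker nothing about the port used for the next one.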

Recursive servers, but not proxy servers, affected.

Posted Jul 10, 2008 12:30 UTC (Thu) by nix (subscriber, #2304) [Link]

Aha. It can't persist the socket but has to re-bind(). OK, glibc's not 
doing that.

(Can you tell I've not done much UDP stuff? I love the Internet: I can let 
my ignorance and incompetence hang out for all to see!)

glibc

Posted Jul 9, 2008 18:52 UTC (Wed) by ncm (guest, #165) [Link] (3 responses)

As I recall, GNU libc doesn't cache DNS results, by default.  It can be told to cache results,
but they never expire, and the cache grows without bound.  Perhaps it is the lack of
expiration that makes it unfixable, and perhaps the default mode (and impractical optional
mode) that makes it of little concern. 

My information might be out of date; has it changed since 1999?

glibc

Posted Jul 9, 2008 20:54 UTC (Wed) by nix (subscriber, #2304) [Link] (2 responses)

I can't see any sign that glibc's internal libresolv or nss-dns layer can 
be told to cache anything but DNS server addresses at all. (Of course nscd 
can cache DNS responses, and expire them.)

glibc

Posted Aug 5, 2008 4:26 UTC (Tue) by rickmoen (subscriber, #6943) [Link] (1 responses)

Nathan (ncm) might be thinking of the caching that transpires if you start the nscd
(nameservice caching daemon).  It's commonly used on systems with heavyweight lookup regimes
(NIS, NIS+, LDAP) to prevent system performance from bogging down excessively, by locally
caching lookup of hosts, users, groups, services, RPC ports, netgroups, etc. but has the
drawback that it ignores TTL values on host lookups.  (That's a sufficient reason to disable
host caching in /etc/nscd.conf.)

The BIND8-derived stub resolver in glibc isn't a huge security risk on most systems despite
its haplessly failing to randomise UDP source ports, because the result doesn't get cached.
(Thus, sending it poisoned data in an ADDITIONAL SECTION portion of a recursive DNS response
doesn't do the attacker much good, because the poison gets metaphorically flushed
immediately.)  However, such a system with nscd caching hostnames would have a problem.  (So,
Don't Do That, Then.)

Rick Moen
rick@linuxmafia.com

glibc

Posted Aug 5, 2008 8:38 UTC (Tue) by nix (subscriber, #2304) [Link]

nscd in glibc 2.8 honours TTLs.

Security updates for embedded boxes

Posted Jul 10, 2008 16:03 UTC (Thu) by Cato (guest, #7643) [Link]

Security updates for embedded systems are poorly managed at present - doesn't matter too much
if it's a DVD player, but now that many embedded devices are Internet connected, it's a real
issue. One example is dnsmasq, which I already have running on my DD-WRT wireless router, but
have now disabled.

Niche distros have this problem a lot - much as I like Damn Small Linux and similar distros,
they don't seem to have any security update policy, and it's hard to know which
vulnerabilities exist. They often run very old software and aren't usually a close derivative
of a mainstream distro, so it's almost certain they have many open vulnerabilities.

Another example is the eee PC - this runs Xandros, which you would think is easy to update
being Debian based, but in practice it seems security updates are missing or very late. One
example is a Samba vulnerability from 2007 that was not patched as of Feb 2008:
http://forum.eeeuser.com/viewtopic.php?id=14237

The general point is: how do you make consumers aware of the need for rock solid security
updates for embedded devices, and thereby cause the vendors to actually bother to implement
this properly? Perhaps a mass of compromised devices due to this DNS cache poisoning issue is
the only way this will happen... Apparently Dan Kaminsky's attack is far more 'point and
click' than previous ones, so in a month or two we can look forward to this being incorporated
in widespread malware and used by botnets.

Maybe this lack of attention to security is simply a sign of an immature market sector - over
time perhaps the standard Linux distros will be ported / adopted, ensuring timely and complete
security updates, but in the meantime Linux on embedded devices may get a bad reputation for
security.

More info in a podcast interview

Posted Jul 8, 2008 23:25 UTC (Tue) by rfunk (subscriber, #4054) [Link]

An interview with Dan Kaminsky has more information.

His simple message is "if it recurses, patch it, but non-recursive clients are also affected as a lesser priority."

"Dan Bernstein completely solved a big security issue he didn't even know about!" with port randomization. But 16-bit randomization isn't enough; they're adding another 11-14 bits of randomization. It also involves the transaction ID.

"Even I gotta admit, maybe there is something to this whole DNSSEC thing...." (but he still isn't saying DNSSEC is workable.)

Dan Kaminsky Discovers Fundamental Issue In DNS: Massive Multivendor Patch Released (Securosis.com)

Posted Jul 9, 2008 1:04 UTC (Wed) by miguelzinho (guest, #40535) [Link] (1 responses)

It seems to me that this problem was addressed before, back in 2007. No one has disclosed how
the attack works, but it is very likely that the issue is the same: clients keep the same
source port for every query.

http://www.trusteer.com/bind9dns

Dan Kaminsky Discovers Fundamental Issue In DNS: Massive Multivendor Patch Released (Securosis.com)

Posted Jul 9, 2008 3:05 UTC (Wed) by rfunk (subscriber, #4054) [Link]

Sounds to me like the attack was just theoretical before, but is now 
practical.

Source port UDP randomization

Posted Jul 9, 2008 7:39 UTC (Wed) by mjcox@redhat.com (guest, #31775) [Link] (1 responses)

The upstream kernel got source port UDP randomization (where no port is specified) in 2.6.24.
You can see this in practice by testing distributions like Fedora 8 or 9 where the glibc stub
resolver will use a different source port on each request, therefore mitigating this issue.
Users of older kernels will either need a backported patch to add this functionality, or
changes to glibc if they want UDP source port randomization. 

Upstream commit:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-...

Source port UDP randomization

Posted Jul 9, 2008 12:42 UTC (Wed) by shane (subscriber, #3335) [Link]

BIND uses its own port selection algorithm. Anyone concerned about portability and security
would too (or would check for port randomness in the configure script).

DNS checker

Posted Jul 9, 2008 13:22 UTC (Wed) by i3839 (guest, #31386) [Link]

Dan Kaminsky's DNS checker can be found on http://doxpara.com/

Issue

Posted Jul 9, 2008 18:58 UTC (Wed) by ncm (guest, #165) [Link] (6 responses)

Why can't we call a flaw a flaw?  "Issue" is corporate-speak, drained of sense.

Yes, I know that the headline comes from the original source.  The guilty party appears to be
Dan Kaminsky himself.  Kudos to Jake for translating correctly to English in the squib text.

Issue

Posted Jul 9, 2008 20:55 UTC (Wed) by nix (subscriber, #2304) [Link]

I have been castigated in the past for calling bugs bugs. Apparently 
customers think our code is bug-free as long as we don't call the, er, 
`issues' they report `bugs', even if they called them exactly that.

(sheesh, ridiculous magical thinking: or, rather, typical bleaching: 
expect `issue' to go this way in ten years or so).

defect / fault / failure

Posted Jul 10, 2008 2:46 UTC (Thu) by zooko (guest, #2589) [Link] (3 responses)

If we're talking about a problem that exists in the design of DNS or
in a particular implementation, we should call it a "defect" (this is
the standard term among the safety engineers for what we programmers
informally call a "bug").  If we're talking about a problem that
exists at run-time, when something goes wrong internally, we should
call it a "fault".  If we're talking about somebody getting ripped off
because they relied on DNS and a criminal exploited DNS, then we
should call it a "failure".

It is useful to make these distinctions among different kinds of
problems, and using the word "defect" instead of "bug" makes it easier
to communicate accurately with safety engineers from other
disciplines.  (Also, as Eric Hughes suggested to me many years ago,
there may be a psychological benefit of this terminology -- a "bug"
sounds like something endogeneous, but a "defect" is clearly the
responsibility of the engineer who designed the system.)


defect -- A flaw in a system or system component that has the potential to cause that system
or component to fail to perform its required function during execution. [Jones - IBM]
(Reference: SEI:SE-CMM)

fault -- an incorrect step, process or data definition in a computer program
(Reference: ISO/IEC JTC1/SC7:14598-1)

failure -- The inability of a system or component to perform its required functions within
specified performance requirements. [IEEE STD 610.12-1990] (Reference: SEI:SE-CMM)

references:

http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=...

http://www.totalmetrics.com/cms/servlet/main?Subject=List...

http://en.wikipedia.org/wiki/Safety_engineering

IEEE 610.12-1990, "IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer
Glossaries"

defect / fault / failure

Posted Jul 10, 2008 5:09 UTC (Thu) by ncm (guest, #165) [Link]

I suppose that leaves "flaw" for cases where you can't or don't want to say which it is.

defect / fault / failure

Posted Jul 11, 2008 16:45 UTC (Fri) by giraffedata (guest, #1954) [Link]

People like to use the term "bug" for its connotation even when its denotation doesn't apply. "Bug" connotes something bad; something that should embarrass someone; something that should be fixed. But its traditional denotation is a programming defect, as distinct from a higher level design defect.

Nobody's saying what this DNS issue is, so I don't know if it's a bug or not, but it sounds to me like the code is all functioning as its designers meant it to, and just not meeting the requirements of today's deployments. I would not use the word "bug" myself.

Also, I'm not sure from what I've read that it's universally regarded as a defect. Are there people defending the existing function? If so, it's more objective to call it an issue than a defect.

An issue is not a euphemism for bug, defect, fault, etc. An issue is something people are thinking about.

defect / fault / failure

Posted Jul 11, 2008 17:01 UTC (Fri) by giraffedata (guest, #1954) [Link]

fault -- an incorrect step, process or data definition in a computer program ( Reference : ISO/IEC JTC1/SC7:14598-1)

This is a pretty bad definition; it doesn't capture the essential difference between a defect and a fault as the terms are commonly used and as used in the other references.

There's a simple distinction: a defect is a state and a fault is an event. If my email sending program doesn't check the divisor for 0 before dividing by it, that's a defect in the program. Each time I run the program and it crashes because it divides by zero, that's a fault in my sending of the email. If I don't have some way to recover and get the email out anyway, it's also a failure in my sending of the email.
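
The same distinction as a toy sketch. The function names and the "email sender" are hypothetical and nothing is actually sent; the only point is where each of the three terms applies.

```python
def average_size(total_bytes, attachments):
    # Defect: the divisor is never checked for zero. This is a property of
    # the code itself and exists whether or not the function is ever called.
    return total_bytes / attachments


def send_email(total_bytes, attachments):
    """Hypothetical sender with recovery; returns a status string."""
    try:
        avg = average_size(total_bytes, attachments)
    except ZeroDivisionError:
        # Fault: the defect was triggered during this particular run.
        # Recovering here keeps the fault from becoming a failure.
        avg = 0.0
    return f"sent (average attachment size {avg:.1f} bytes)"


def send_email_no_recovery(total_bytes, attachments):
    # Same defect, same fault, but nothing recovers from it, so the
    # exception propagates and the email does not go out: a failure.
    avg = average_size(total_bytes, attachments)
    return f"sent (average attachment size {avg:.1f} bytes)"


if __name__ == "__main__":
    print(send_email(100, 0))
    try:
        send_email_no_recovery(100, 0)
    except ZeroDivisionError:
        print("failure: email was not sent")
```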

The defect is being patched, the issue remains

Posted Jul 10, 2008 14:02 UTC (Thu) by copsewood (subscriber, #199) [Link]

Dan Kaminsky didn't discover the basic problem with the design of DNS or implementations of
it. This was known about years ago, to the extent DJB was aware of it and worked around it to
make DJBDNS not vulnerable to the same extent other unpatched DNS implementations are.
Kaminsky appears to have discovered an attack which exploits the problem in a more devastating
manner than previously known possible.

The basic defect in DNS isn't solved by the current set of patches. DNS without DNSSEC, even
if further third party cache-poisoning exploits are not discovered, still depends upon trust
in a chain of middlemen DNS caching servers in order to communicate authoritative DNS
information from DNS content servers to clients. So the wider issue of insecurity inherent
within the design of DNS itself remains. The additional entropy provided by these patches
makes a class of technical attacks by outsiders more difficult, but this simply delays the
inevitable need to transition to DNSSEC at some point in the future.
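
To put rough numbers on the additional entropy, here is a back-of-the-envelope sketch using the figures quoted earlier in this thread (16 bits, plus another 11 to 14). It assumes the attacker makes distinct blind guesses against a uniformly random value, which is a simplification; the function names are made up for illustration.

```python
def search_space(bits):
    """Number of equally likely values for a secret with this many bits."""
    return 2 ** bits


def guess_success_probability(bits, forged_replies):
    """Chance that one of `forged_replies` distinct guesses is correct."""
    return min(1.0, forged_replies / search_space(bits))


if __name__ == "__main__":
    # 16 bits alone, then 16 + 11 and 16 + 14 bits after the patches.
    for bits in (16, 16 + 11, 16 + 14):
        p = guess_success_probability(bits, forged_replies=1000)
        print(f"{bits} bits: {search_space(bits)} values, "
              f"P(hit with 1000 forged replies) = {p:.8f}")
```

Each extra bit doubles the number of forged replies needed for the same odds, which makes the attack slower but not impossible.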