By Jake Edge
July 14, 2010
A recent report clearly demonstrates that
computer security is not exempt from the "law of unintended consequences".
As DNSSEC (Domain Name System Security Extensions) is rolled out, we will
likely see various kinds of unanticipated problems in that system, which is
meant to secure the internet name resolution process. One of the concerns
about DNSSEC has always been the amount of additional traffic it would
generate, as well as the processing burden on DNS servers—both of
those came into play here.
The report
from Cisco reads a bit like a detective story. A particular DNS server
saw a sudden 2-3x increase in its traffic, which at first glance appeared
to be some kind of denial of service (DoS) attack. Further investigation
showed that the query rate jumped from tens of queries per second (qps) to around
3000 qps. Because it was a DNSSEC-signed zone, each query required two
responses, a key resource record (DNSKEY RR) and a signature RR (RRSIG
RR), totaling more than 1KB in size. That led to response traffic of
35 megabits per second (3000+ queries x 1K+ bytes).
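The arithmetic behind that figure can be checked with a short sketch. The function names here are illustrative and not from the report; the inputs are the article's round numbers (3000 qps, 1K bytes, 35 megabits per second), and the per-query size it derives is only what those numbers imply, not a measured value.

```python
def response_mbps(queries_per_second: float, bytes_per_query: float) -> float:
    """Response traffic in megabits per second (1 Mbit = 1,000,000 bits)."""
    return queries_per_second * bytes_per_query * 8 / 1_000_000


def implied_bytes_per_query(mbps: float, queries_per_second: float) -> float:
    """Average response bytes per query needed to reach a given traffic level."""
    return mbps * 1_000_000 / 8 / queries_per_second


# Lower bound: exactly 3000 qps and exactly 1024 bytes per query.
floor_mbps = response_mbps(3000, 1024)

# What 35 Mbit/s at 3000 qps would imply per query.
implied = implied_bytes_per_query(35, 3000)

print(f"{floor_mbps:.1f} Mbit/s at 3000 qps x 1024 bytes")       # 24.6
print(f"{implied:.0f} bytes per query implied by 35 Mbit/s")      # 1458
```

At exactly 3000 qps and 1024 bytes the total is about 24.6 megabits per second, so the "+" on both numbers accounts for the rest of the 35.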
When a small amount of data can be sent that generates a much larger
response, there is an "amplification" effect going on. That means that an
attacker can use much less of a resource, bandwidth say, than the victim
must use to respond. So, a small bandwidth investment, spread out over a large number
of attackers (in a DDoS, distributed DoS), can easily cause a victim to generate
more traffic than it can handle. Because DNS typically uses UDP to eliminate
the connection establishment overhead of TCP, it also makes it easier for
attackers, as they don't need to use state-tracking resources on their end.
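As a rough illustration of the amplification effect, the sketch below compares the bandwidth spent on queries with the bandwidth spent on responses. The 64-byte query size is an assumption chosen for the example, not a number from the report; the 1024-byte response and the 3000 qps rate are the article's round figures.

```python
QUERY_BYTES = 64        # assumed size of a small UDP DNS query (hypothetical)
RESPONSE_BYTES = 1024   # "more than 1K" per query, from the article
QUERIES_PER_SECOND = 3000


def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """How many response bytes each query byte provokes."""
    return response_bytes / query_bytes


def mbps(queries_per_second: int, bytes_each: int) -> float:
    """Traffic in megabits per second for a given rate and message size."""
    return queries_per_second * bytes_each * 8 / 1_000_000


factor = amplification_factor(QUERY_BYTES, RESPONSE_BYTES)
inbound = mbps(QUERIES_PER_SECOND, QUERY_BYTES)      # cost to the senders
outbound = mbps(QUERIES_PER_SECOND, RESPONSE_BYTES)  # cost to the responder

print(f"amplification: {factor:.0f}x")               # 16x
print(f"queries:   {inbound:.3f} Mbit/s")            # 1.536
print(f"responses: {outbound:.3f} Mbit/s")           # 24.576
```

Under these assumed sizes, the senders collectively spend about 1.5 megabits per second while the responder must emit roughly sixteen times that.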
These kinds of amplification attacks are well known for the existing DNS. It is also
understood that DNSSEC adds other amplification possibilities, but those
known cases were not the cause of the problem that Cisco investigated.
In analyzing the data from this event, it was determined that a very small
fraction of clients (1,000 out of 500,000-1,000,000 daily unique clients)
were making repeated queries for the same DNSKEY RR. That could have been
caused by a bug in certain DNS clients, but the suddenness of the event
suggested some kind of "external trigger". An obvious trigger would be a
change to the DNS information being served, and that was in fact the case:
the cryptographic keys had been changed on the day of the traffic increase.
Normally, a key rollover is handled by keeping both keys around for an
overlap period and signing resource records with both keys during that
time. Either key can be used to verify the signature during the overlap,
and eventually the old key can be deprecated and then removed. Keys are
signed by a parent server's key (e.g. example.com's key is signed by the
.com server's key), all the way up to the key used to sign the root keys.
Today, those root keys are often stored locally by the client as "Trust
Anchors". If a client cannot verify a signature with a key that it has in
its cache, it will request a new key from the parent, because it assumes
the key has changed.
Before it requests a key from the parent, though, it re-requests the key
from the server it is talking to, because it assumes that it is getting
bogus responses from some kind of attack. If it really were an attack,
that would be the right response, but if there were some misconfiguration
on the part of the client, it would just make the problem worse. It turns
out that some clients were distributed with a static set of Trust Anchors.
Once those keys were rolled over, those clients were out of date and could
no longer resolve names associated with those parts of the DNS hierarchy.
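The rollover sequence and the fate of a client with a static Trust Anchor can be sketched with a toy model. Nothing here performs real cryptography: keys are plain string labels, "signing" just means listing which key IDs are active, and the `Zone` class and `can_validate` function are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Toy zone: records are 'signed' by every currently active key ID."""
    active_keys: set = field(default_factory=set)

    def signature_key_ids(self) -> set:
        # During the overlap period this contains both old and new keys.
        return set(self.active_keys)


def can_validate(trusted_keys: set, zone: Zone) -> bool:
    """Validation succeeds if the client trusts any key that signed the data."""
    return bool(trusted_keys & zone.signature_key_ids())


static_client = {"old-key"}     # Trust Anchor shipped once, never updated
updating_client = {"old-key"}   # client that tracks key changes

zone = Zone({"old-key"})

# Step 1: overlap period, data is signed with both keys.
zone.active_keys.add("new-key")
static_during_overlap = can_validate(static_client, zone)    # True

# Step 2: a well-behaved client picks up the new key during the overlap.
updating_client.add("new-key")

# Step 3: the old key is removed.
zone.active_keys.discard("old-key")
static_after = can_validate(static_client, zone)      # False: out of date
updating_after = can_validate(updating_client, zone)  # True

print(static_during_overlap, static_after, updating_after)
```

The static client keeps working through the overlap, which is why the failure only shows up, suddenly, when the old key is finally withdrawn.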
But the amplification turns out to be quite a bit larger than just a
handful of retries for an affected server. When a client cannot verify a
signature, it will do a depth-first search of the alternative name servers,
querying each server to try to find keys that it can use. There are 14
.com name servers, and potentially several name servers for
example.com. This leads to a combinatorial explosion of sorts,
where a query for a single host name (test.example.com for
example) in a simple configuration (two example.com name servers)
leads to 844 separate queries.
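To get a feel for why the count grows so quickly, here is a deliberately simplified model of a depth-first retry. It is not the accounting used in the Cisco report and it does not reproduce the 844 figure; it assumes, purely for illustration, that the client sends one query to each server at a level and, for each of them, retries every server at the next level down. The function name and the one-query-per-server default are assumptions.

```python
def count_queries(servers_per_level, queries_per_server=1):
    """Count queries for an exhaustive depth-first retry (toy model).

    servers_per_level lists the number of name servers at each level,
    parent first, e.g. [14, 2] for 14 .com servers and 2 example.com servers.
    """
    if not servers_per_level:
        return 0
    head, *rest = servers_per_level
    # Each server at this level costs its own queries plus a full retry
    # of everything below it.
    return head * (queries_per_server + count_queries(rest, queries_per_server))


two_child_servers = count_queries([14, 2])     # 14 * (1 + 2) = 42
three_child_servers = count_queries([14, 3])   # 14 * (1 + 3) = 56
two_queries_each = count_queries([14, 2], 2)   # 14 * (2 + 2 * 2) = 84

print(two_child_servers, three_child_servers, two_queries_each)
```

Even in this stripped-down form, the total is a product of the server counts at each level, so adding one name server or one extra record type per attempt multiplies the load rather than adding to it.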
Other, much worse, scenarios are described in the report. It is
interesting that perfectly reasonable behavior by clients who have ended up
with outdated information can lead to such a huge increase in the traffic
that DNS servers, especially the root servers, may have to handle. The
conclusion from the Cisco report is certainly eye-opening:
It is an inherent quality of the DNSSEC deployment that in seeking to
prevent lies, an aspect of the stability of the DNS has been weakened. When
a client falls out of synchronization with the current key state of DNSSEC,
it will mistake the current truth for an attempt to insert a lie. The
subsequent efforts of the client to perform a rapid search for what it
believes to be a truthful response could reasonably be construed as a
legitimate response, if indeed this instance was an attack on that
particular client. Indeed, to do otherwise would be to permit the DNS to
remain an untrustable source of information. However, in this situation of
slippage of synchronized key state between client and server, the effect is
both local failure and the generation of excess load on external
servers, and if this situation is allowed to become a common state, it has
the potential to broaden the failure state to a more general DNS service
failure through load saturation of critical DNS servers.
This aspect of a qualitative change of the DNS is unavoidable, and it
places a strong imperative on DNS operations and the community of the 5
million current and uncountable future DNS resolvers to understand that
"set and forget" is not the intended mode of operation of DNSSEC-equipped
clients.
The last paragraph is particularly worrisome. One would guess that a few
years down the road, most clients will be DNSSEC-equipped. And most will
be in the hands of users who know nothing about key rollover,
amplification, DoS, or, for that matter, DNS or DNSSEC. It will be up to
the vendors and distributors to ensure that the "forget" part of "set and
forget" doesn't happen. It is not hard to envision some kind of nasty
apocalypse lurking for DNSSEC if that's not the case.