By Jake Edge
August 7, 2013
An attack against encrypted web traffic (i.e. HTTPS) that can reveal
sensitive
information to observers was presented at the
Black Hat
security conference. The attack does not involve any kind of
actual decryption of HTTPS traffic, but it can nevertheless determine
whether certain data is present in the page source. That
data might include email addresses, security tokens, account numbers, or
other potentially sensitive items.
The attack uses a modification of the CRIME
(compression ratio info-leak made easy) technique, but instead of targeting
browser
cookies, the new attack focuses on the pages served from the web server
side. Dubbed BREACH
(browser reconnaissance and exfiltration via adaptive compression of
hypertext—security researchers are nothing if not inventive with names),
the attack was demonstrated
on August 1. Both CRIME and BREACH require that the session use
compression, but CRIME needs it at the
Transport Layer Security (TLS, formerly Secure Sockets
Layer, SSL) level, while
BREACH only requires the much more common HTTP compression. In both cases,
because the data is
compressed, just comparing
message sizes can reveal important information.
In order to perform the attack, multiple probes need to be sent from a
victim's browser to the web site of interest. That requires that the
victim get infected with some kind
of browser-based malware that can perform the probes. The usual mechanisms
(e.g. email, a compromised web site, or man-in-the-middle) could be used to
install the probe. A
wireless access point and router would be one obvious place to house this
kind of attack, since it has the man-in-the-middle position needed to see
the responses, along with the ability to insert malware into any
unencrypted web page visited.
The probes are used as part of an "oracle" attack.
An oracle attack is one where the attacker can send multiple different
requests to the vulnerable software and observe the responses. It is, in
some ways, related to the "chosen plaintext" attack against a cryptography
algorithm. When trying to break a code, arranging for the "enemy" to
encrypt your message in their code can provide a wealth of details about
the algorithm. With computers, it is often the case that
an almost unlimited number of probes can be made and the results analyzed. The
only limit is typically time or bandwidth.
BREACH can only be used against sites that reflect user input from
requests in their responses. That allows the site to, in effect, become an
oracle. Because the HTTP compression will replace
repeated strings with shorter constructs (as that is the goal of the
compression), a probe response with a (server-reflected) string that
duplicates one
that is already present in the page will elicit a shorter response than a
probe for an unrelated string. Finding that a portion of the
string is present allows the probing
tool to add another digit or character to the string, running through
all of the possibilities and checking for a match.
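The effect is easy to demonstrate in a toy model, with Python's zlib
standing in for HTTP-level compression (the page layout and secret below
are invented for illustration): a reflected probe that duplicates part of
an embedded secret compresses to a shorter output than an unrelated probe
of the same length.

```python
import zlib

# Toy model: the page reflects the attacker-supplied string next to a
# secret value. SECRET and the page layout are made up for illustration.
SECRET = "token=xk7q"

def response_length(reflected):
    page = f"<p>You searched for: {reflected}</p><p>{SECRET}</p>"
    # DEFLATE replaces a repeated string with a short back-reference, so
    # overlap between the probe and the secret shrinks the output.
    return len(zlib.compress(page.encode()))

print(response_length("token=x"))   # overlaps the secret: shorter
print(response_length("qwerm=v"))   # same length, no overlap: longer
```

The seven matching bytes are replaced by a back-reference of a couple of
bytes, so the size difference is visible even for this tiny page.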
For data that has a fixed or nearly fixed format (e.g. email
addresses, account numbers, cross-site request forgery tokens), each probe
can try a variant (e.g. "@gmail.com" or "Account number: 1") and compare
the length of the reply to that of one without the probe. Shorter responses
correlate with correct guesses, because the duplicated string gets
compressed out of the response; correspondingly, longer responses indicate
incorrect guesses. It is reported that 30 seconds of probing is enough to
essentially brute-force email addresses and other sensitive information.
Unlike CRIME, which can be avoided by disabling TLS
compression, BREACH will be more difficult to deal with. The researchers
behind BREACH list a number of mitigations, starting with
disabling HTTP compression. While that is a complete fix for the problem,
it is impractical for most web servers because of the additional bandwidth
it would require; it would also increase page load times.
Perhaps the most practical solution is to rework applications so that user
input is not reflected onto pages with sensitive information. That way,
probing will not be effective, but it does mean a potentially substantial
amount of work on the web application. Other possibilities like
randomizing or masking the sensitive data will also require application rework.
At the web server level, one could potentially add a random amount of data
to responses
(to obscure the length) or rate-limit requests, but both of those are
problematic from a performance perspective.
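A minimal sketch of the length-obscuring idea might look like the
following (names are illustrative, not from any real server). One detail
worth noting: the padding itself must not compress away, so a run of
identical bytes will not do; random content is needed.

```python
import secrets

# Illustrative sketch: append a random-length, random-content pad so the
# observed response size no longer tracks the compression ratio cleanly.
def pad_response(body: bytes, max_pad: int = 32) -> bytes:
    pad_len = secrets.randbelow(max_pad + 1)
    # Random hex is effectively incompressible, unlike b"X" * pad_len,
    # which DEFLATE would collapse back to a near-constant size.
    pad = secrets.token_hex(pad_len).encode()
    return body + b"<!-- pad:" + pad + b" -->"

print(len(pad_response(b"<html>hello</html>")))
print(len(pad_response(b"<html>hello</html>")))  # usually differs
```

Even so, padding only adds noise; an attacker who can average over many
probes can still recover the signal, which is why it is listed as a
mitigation rather than a fix.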
Over the years, various attacks against HTTPS have been found.
That is to be expected, really, since cryptographic systems always get
weaker over time. There's nothing to indicate that HTTPS is fatally
flawed, though this side-channel attack is fairly potent. With governments
actively collecting traffic—and using malware—it's not much of a
stretch to see the two being combined. Governments don't much like
encryption or anonymity, and flaws like BREACH will unfortunately be available to
help thwart both, now and in the future.