
Blocking DPI with Dust

By Jake Edge
September 5, 2013

While encrypted communication over the internet is certainly nothing new, recent events have highlighted some good reasons to use it. But protocols that layer encryption atop TCP or UDP are generally easy to identify, so governments or ISPs can block them. A new protocol, called Dust [PDF], sets out to provide "blocking resistance", so that commonly used techniques, like blocking based on deep packet inspection (DPI), will be difficult to apply. The overall goal for Dust is to resist censorship in the form of internet blocking.

There are a number of different projects that provide some form of censorship resistance, including document publishing services such as Publius, Tangler, and Mnemosyne [PDF]. But in order to retrieve documents, users must be able to connect to the service, which is easy to thwart via IP address blocking.

So it makes sense to combine anonymous document storage with "hidden services" from an anonymizing proxy like Tor. But those connections are still vulnerable to DPI, which blocks traffic based on the contents of the packets. What is needed, then, is a way to avoid the DPI filters while connecting to the anonymizing proxy. To that end, "the ideal communication protocol is therefore one which is unobservable, meaning that a packet or sequence of packets is indistinguishable from a random packet or random sequence of packets", according to Dust developer Brandon Wiley. Creating that is essentially the design goal for the protocol.

There have been other efforts to create encrypted, censorship-resistant protocols. Wiley's paper mentions several, including Message Stream Encryption (MSE) for BitTorrent, Obfuscated TCP, and Tcpcrypt (which we looked at in 2010). MSE and Tcpcrypt have flaws, such as static strings in the handshake or predictable packet sizes, that make them easy to detect and filter. Obfuscated TCP has several variants that communicate keys in different ways (e.g. TCP options, HTTP headers, DNS records), all of which can be detected by current DPI filtering.

Key exchange is the most difficult piece of any encryption puzzle. To some extent, Dust punts on that by requiring an "out of band" invitation to be received by a client before it can connect to the server. The invitation has the IP address, port, and public key for the server, along with an invitation-specific password and invitation ID, all of which are encrypted using the password. The invitation ID is a random, single-use identifier that the server can use to determine which invitation (and thus which password) is being used when the client introduces itself with the invitation.
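As a rough illustration of the bookkeeping this implies on the server side, here is a minimal Python sketch. The field names, the 16-byte password length, and the `InvitationRegistry` class are all assumptions made for the example; they are not taken from Wiley's paper or the Haskell code, and the step that encrypts the invitation with the password is left out entirely.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

ID_LEN = 32        # bytes; matches the 32-byte ID the article mentions for intro packets
PASSWORD_LEN = 16  # assumption: the article does not give a password length


@dataclass(frozen=True)
class Invitation:
    """The fields the article lists for an invitation (names are hypothetical)."""
    ip: str
    port: int
    server_public_key: bytes
    password: bytes
    invitation_id: bytes


class InvitationRegistry:
    """Hypothetical server-side table mapping invitation IDs to passwords."""

    def __init__(self) -> None:
        self._passwords: Dict[bytes, bytes] = {}

    def issue(self, ip: str, port: int, server_public_key: bytes) -> Invitation:
        # Both the ID and the password come from the OS random number generator.
        invitation = Invitation(
            ip=ip,
            port=port,
            server_public_key=server_public_key,
            password=secrets.token_bytes(PASSWORD_LEN),
            invitation_id=secrets.token_bytes(ID_LEN),
        )
        self._passwords[invitation.invitation_id] = invitation.password
        return invitation

    def redeem(self, invitation_id: bytes) -> Optional[bytes]:
        # Assumption: "single-use" is modeled by removing the entry on first
        # lookup, so a second attempt with the same ID returns None.
        return self._passwords.pop(invitation_id, None)
```

Whether a real server forgets the ID immediately or only after the handshake succeeds is a design choice the article does not settle; the sketch simply picks the stricter option.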

The actual invitation is of no use without the password, so it could be sent via any channel. Because of the encryption, the invitation is "indistinguishable from random bytes". Wiley is focused on automated DPI, so he seems a little cavalier about transmitting the password:

It can then be safely transmitted, along with the password, over an out-of-band channel such as email [or] instant messaging. It will not be susceptible to the attacks which block email communication containing IP addresses because only the password is transmitted unencrypted. If the invitation channel is under observation by the attacker, and only in the case that the attacker is specifically attempting to filter Dust packets, then the password should be sent by another channel that, while it can still be observed by the attacker, should be uncorrelated with the invitation channel.

With an invitation and password in hand, a client can connect to the server by sending an introduction (or intro) packet to the server. The intro packet is prepended with the invitation ID (which is random). The rest of the packet is encrypted with the password and contains the client's public key. When the server receives a packet from an unknown host, it assumes that the first 32 bytes are the ID and tries to look up the password based on that. It then decrypts the rest of the packet and stores the IP address, port, and public key.
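The server's side of that exchange might be sketched as follows. This is only an illustration under stated assumptions: `handle_intro`, the `passwords` table, and the `decrypt` callback are hypothetical names, and the cipher itself is passed in rather than implemented, since the article does not say which one Dust uses.

```python
from typing import Callable, Dict, Optional

ID_LEN = 32  # the article says the first 32 bytes are taken as the invitation ID


def handle_intro(
    packet: bytes,
    passwords: Dict[bytes, bytes],
    decrypt: Callable[[bytes, bytes], bytes],
) -> Optional[bytes]:
    """Return the decrypted intro body, or None if the packet should be dropped.

    `passwords` maps invitation IDs to passwords; `decrypt(password, body)`
    stands in for whatever password-based cipher the real protocol uses.
    """
    if len(packet) <= ID_LEN:
        return None  # too short to hold an ID plus an encrypted body
    invitation_id, body = packet[:ID_LEN], packet[ID_LEN:]
    password = passwords.get(invitation_id)
    if password is None:
        return None  # unknown ID: indistinguishable from noise, so ignore it
    return decrypt(password, body)
```

In this sketch the caller would go on to record the sender's IP address and port alongside the public key recovered from the body.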

At that point, the handshake is complete. Both server and client can compute shared session keys using each other's public key and the password so that they can exchange encrypted messages from then on. That is done using the data packet, which is the third packet type (invite and intro are the other two).

There are several other features of the Dust packet format that bear mention. To start with, packets can be chained within a single TCP or UDP packet. Since the client has the server's public key from the invite, it can send both an intro and data packet in a single TCP packet. That may be all the client wants to say, so chaining is a useful optimization, but it also helps protect against inter-packet timing analysis that could be used to detect Dust.

The packets are protected with a message authentication code (MAC), which is calculated using a password-based key derivation function (PBKDF) with a random initialization vector (IV) that is transmitted with each Dust packet. Both the MAC and IV are sent in the clear; since the IV is a random per-packet value and the MAC is calculated from it, both are effectively random to an observer. In the encrypted portion of the packet, timestamps are included to protect against replay attacks and a random amount of random-padding bytes is added to each packet so that the packet length is unpredictable. As might be obvious, good random number generation is an important part of a Dust implementation.
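To make those pieces concrete, here is a small Python sketch of a packet with a per-packet IV, a MAC keyed through a PBKDF, a timestamp, and random-length padding. Nearly every specific is an assumption: the field sizes, the header layout, the use of PBKDF2 and HMAC-SHA256, the iteration count, and the 60-second freshness window are chosen for illustration and are not Dust's actual wire format. Encryption of the body is omitted because Python's standard library has no cipher, so unlike a real Dust packet, this one does not look random on the wire.

```python
import hashlib
import hmac
import secrets
import struct
import time
from typing import Optional

IV_LEN, MAC_LEN, MAX_PAD = 16, 32, 64  # illustrative sizes, not Dust's
HEADER = struct.Struct("!QHB")         # timestamp, payload length, padding length


def mac_key(password: bytes, iv: bytes) -> bytes:
    # PBKDF2 stands in for the unnamed PBKDF; the per-packet IV acts as the salt.
    return hashlib.pbkdf2_hmac("sha256", password, iv, 1000)


def seal(password: bytes, payload: bytes, now: Optional[int] = None) -> bytes:
    """Build IV | MAC | body, where body = header | payload | random padding."""
    padding = secrets.token_bytes(secrets.randbelow(MAX_PAD + 1))
    timestamp = int(time.time()) if now is None else now
    body = HEADER.pack(timestamp, len(payload), len(padding)) + payload + padding
    iv = secrets.token_bytes(IV_LEN)
    mac = hmac.new(mac_key(password, iv), body, hashlib.sha256).digest()
    return iv + mac + body  # a real packet would encrypt the body first


def unseal(password: bytes, packet: bytes, now: Optional[int] = None,
           max_age: int = 60) -> Optional[bytes]:
    """Return the payload, or None if the MAC, timestamp, or lengths are bad."""
    if len(packet) < IV_LEN + MAC_LEN + HEADER.size:
        return None
    iv = packet[:IV_LEN]
    mac = packet[IV_LEN:IV_LEN + MAC_LEN]
    body = packet[IV_LEN + MAC_LEN:]
    expected = hmac.new(mac_key(password, iv), body, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        return None  # wrong password or tampered packet
    timestamp, payload_len, pad_len = HEADER.unpack_from(body)
    current = int(time.time()) if now is None else now
    if abs(current - timestamp) > max_age:
        return None  # stale timestamp: treat as a possible replay
    if HEADER.size + payload_len + pad_len != len(body):
        return None
    return body[HEADER.size:HEADER.size + payload_len]
```

Two calls to `seal()` with the same payload will usually produce packets of different lengths, which is the point of the padding; the payload here is limited to 65535 bytes by the two-byte length field.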

All of those techniques should make Dust resistant to protocol fingerprinting using DPI. The packets look like random data of random length, which could be almost anything: streaming audio/video, some kind of file transfer, etc. Of course, the connection just immediately starts up in that mode, which might be considered suspicious in and of itself. But the existing blocking typically centers around blacklists of protocols that DPI can detect. Dust will not easily fall prey to that kind of filtering.
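The suspicion that a stream of pure randomness could itself be a tell can be illustrated with a byte-entropy estimate. The sketch below is a hypothetical heuristic written for this article, not something attributed to any deployed filter; the function name is an assumption.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # Sum -p * log2(p) over every byte value that actually occurs.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Plain-text protocols tend to score well below eight bits per byte, while encrypted or compressed payloads come close to it. That is also why such a measure cannot, on its own, distinguish Dust from other encrypted or compressed traffic.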

A bigger worry is whitelist-oriented filtering. If the DPI filters will only allow recognized protocols through, Dust will clearly fail the test. Whitelists can be circumvented using steganography (i.e. by hiding the real message inside a packet of one of the "legal" protocols), but that has its own set of problems. Steganographic techniques may lead to packets that can be more easily fingerprinted and blocked. Whitelists will also be difficult for ISPs or governments to enforce, just from a social point of view.

Code for Dust (in Haskell) can be found at GitHub. More information can be found in the README files there in addition to Wiley's paper.

Overall, Dust is an intriguing idea. It is meant to serve as an underlying protocol for something like Tor (which, in turn, may underlie secure and anonymous document distribution). While it is well-tuned to avoid today's DPI (and other) attacks, one wonders if just random gibberish at the start of a connection will be enough to set off tomorrow's filters. Of course, an internet where all of the data was encrypted would potentially obviate the need for something like Dust. In the meantime, at least, Dust seems worth a look.



Blocking DPI with Dust

Posted Sep 6, 2013 2:42 UTC (Fri) by mathstuf (subscriber, #69389)

It seems to me that this would have to hit some critical mass (with some key services being accessible only over Dust) before any leverage the spooks have would be beaten. If Google were to go Dust-only, how long would it be before the NSA knocked on their door to turn it off? If Google or Farcebook were forced offline because of principles (ha!), would people care enough to actually do something or would they just shrug and go back to AltaVista and MySpace?

That said, I would like to see this become prevalent.

Blocking DPI with Dust

Posted Sep 6, 2013 19:19 UTC (Fri) by noxxi (subscriber, #4994)

I think DPI and IDS are more advanced than the author of Dust assumes, at least in the research. They are not limited to pattern matching or simple statistical analysis. Using machine learning techniques (like n-gram analysis of the data stream), a protocol with too much random data and no detectable signature will probably stand out among the other protocols, which have a clearer signature. It might not be possible to detect the protocol based on a single packet, but given enough data in the flow it will be detected.

Blocking DPI with Dust

Posted Sep 12, 2013 21:00 UTC (Thu) by blanu (guest, #92833)

Hi! I'm the author of Dust. Thanks for your interest in the project.

I'd like to mention two points that might be interesting to readers. First, the paper describes Dust v1, which is obfuscation only (randomize everything). Dust v2 (no paper yet) includes a shaping layer which makes the encoded message resemble a given protocol. So you can make the traffic look like HTTP, SMTP, NTP, etc. rather than purely random.

Also, I should point out that Dust is designed to defeat actual filters which are currently deployed, not the filtering techniques which you find in the CS literature that use very advanced techniques which are not actually deployed in the field. A major component of my research is determining what filtering techniques are actually used and what the minimum intervention required is to circumvent them.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds