By Jake Edge
September 1, 2010
Steganography is an ancient method of hiding a message in plain sight. In
the digital age, steganography is often associated with hiding data inside
a binary file, typically using the low bits of an image or audio file in
such a way that the message makes very little difference in the output.
The Collage
project looks to use steganography in conjunction with sites that host lots
of user-generated content to provide a communication channel that resists
censorship.
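As a rough illustration of the low-bit technique mentioned above, here is a minimal Python sketch that hides a message in the least significant bit of each byte of some cover data. The function names (embed_lsb, extract_lsb) and the use of a raw byte string as a stand-in for pixel values are assumptions made for this example; it is not Collage's code, and real tools operate on decoded image or audio samples rather than arbitrary bytes.

```python
def embed_lsb(cover: bytes, message: bytes) -> bytes:
    """Hide each bit of `message` in the lowest bit of successive cover bytes."""
    # Message bits, most significant bit of each byte first.
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover data is too small for this message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the low bit, then set it
    return bytes(stego)


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Recover `length` message bytes from the low bits of `stego`."""
    message = bytearray()
    for i in range(length):
        value = 0
        for j in range(8):
            value = (value << 1) | (stego[i * 8 + j] & 1)
        message.append(value)
    return bytes(message)


cover = bytes(range(256))          # stand-in for pixel values (hypothetical)
stego = embed_lsb(cover, b"hi")
assert extract_lsb(stego, 2) == b"hi"
```

Each cover byte changes by at most one, which is why the alteration is hard to see in the rendered image. Note that the receiver needs to know the message length, which a real scheme would have to convey somehow.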
As the slides
[PDF] and paper
[PDF] from a recent presentation on Collage describe, there are increasing
attempts to censor internet communications. It is not just repressive
regimes that are guilty of such censorship either, as various democratic
governments are trying—sometimes succeeding—to get into the
game. Existing methods to route around things like the "great firewall of
China" rely on using proxies (e.g. Tor) outside of the censorship wall.
But proxies are relatively easily identified and blocked. Worse yet,
anyone attempting to use one of the proxies can be identified and punished.
By using sites that ordinary "law abiding" citizens visit routinely,
Collage seeks to appear completely innocuous to the censoring devices. The
specific example used is photo-sharing sites like Flickr. Many people
legitimately browse the photos there, so it will be difficult to determine
that a
particular user may be browsing for photos that contain a steganographic
message. In addition, the sheer number of photos stored on the site makes
it difficult for the censors to catalog those that may contain a hidden
message.
It is, essentially, a form of "security through obscurity", but one that
can offer a level of deniability if used properly. If a censored user
frequently visited Flickr for photo uploading and browsing, and only
infrequently used it
to pass messages, the message passing would be difficult to detect by
anything other than targeted monitoring of that user's traffic. Unlike
proxies, there is no
need for anyone to maintain an infrastructure of hosts to handle the
traffic; Flickr,
YouTube, and others are already doing so.
The basic idea is that a simple message is encrypted (using some key agreed
upon separately), then broken into pieces, with erasure coding added
so that the
entire message can be re-assembled from just a subset of the pieces. Those
chunks then get steganographically inserted into multiple photos, which are uploaded to a photo-sharing site.
The project also used a text steganography technique to hide messages in
the text of comments on blogs, YouTube, Twitter, and so on. In either
case, the presence of steganography is likely to be detectable if the
censoring agency looks for it. But with proper encryption, the actual message
text will not be recoverable. The paper also discusses the use of watermarking
to hide information that may be more easily detected but is hard to remove
without disrupting the containing photo or file.
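The split-and-reassemble step described above can be sketched in Python. This is a deliberately simplified stand-in: it adds a single XOR parity chunk, so the message survives the loss of any one piece, which is far weaker than a real erasure code and is not claimed to be the coding Collage uses. The input is assumed to be already encrypted, and the names encode, decode, and xor_bytes are invented for the example.

```python
from functools import reduce
from typing import Dict, List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(message: bytes, k: int) -> List[bytes]:
    """Split `message` into k equal chunks and append one XOR parity chunk."""
    size = -(-len(message) // k)                  # ceiling division
    padded = message.ljust(size * k, b"\0")       # pad the final chunk
    chunks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return chunks + [reduce(xor_bytes, chunks)]   # parity gets index k


def decode(pieces: Dict[int, bytes], k: int, length: int) -> bytes:
    """Rebuild the message from the pieces, tolerating one lost data chunk."""
    missing = [i for i in range(k) if i not in pieces]
    if len(missing) > 1 or (missing and k not in pieces):
        raise ValueError("too many pieces lost to rebuild the message")
    data = dict(pieces)
    if missing:
        # XOR of the surviving data chunks and the parity yields the lost chunk.
        data[missing[0]] = reduce(xor_bytes, pieces.values())
    return b"".join(data[i] for i in range(k))[:length]


ciphertext = b"already-encrypted message bytes"   # assumed encrypted elsewhere
pieces = encode(ciphertext, k=4)
survivors = {i: p for i, p in enumerate(pieces) if i != 2}  # piece 2 is lost
assert decode(survivors, k=4, length=len(ciphertext)) == ciphertext
```

In a scheme like the one the article describes, each piece would then be embedded in a different photo, so that the recipient needs to fetch only some of the photos to recover the whole message.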
In order for a message to reach its recipient, though, there needs to be
some way for them to know which of the billions of photos at Flickr
actually contain bits of interest. In addition, the downloads made by the
user must appear to be "normal" tasks that a Flickr user might perform.
The paper outlines a rather elaborate protocol that could be used to map
messages to "deniable tasks" that the recipient must perform. It's a
tricky problem, as the paper acknowledges:
"The challenge, of course, is finding sets of tasks that are deniable, yet
focused enough to allow a user to retrieve content in a reasonable amount
of time."
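To make the idea of mapping a message to tasks more concrete, here is one way such a mapping might look in Python. It is only a sketch under assumptions of my own: sender and receiver are presumed to share a list of candidate tasks and a message identifier in advance, and the function name, the task strings, and the use of SHA-256 are all hypothetical rather than taken from the paper's protocol.

```python
import hashlib
from typing import List


def tasks_for(identifier: str, task_list: List[str], count: int) -> List[str]:
    """Deterministically pick `count` distinct tasks for a message identifier."""
    chosen: List[str] = []
    counter = 0
    while len(chosen) < min(count, len(task_list)):
        # Hash the identifier with a counter to get a repeatable index.
        digest = hashlib.sha256(f"{identifier}:{counter}".encode()).digest()
        task = task_list[int.from_bytes(digest[:8], "big") % len(task_list)]
        if task not in chosen:
            chosen.append(task)
        counter += 1
    return chosen


# Hypothetical shared list of ordinary-looking things a Flickr user might do.
shared_tasks = [
    "search for photos tagged 'sunset'",
    "browse the most recent uploads",
    "search for photos tagged 'cats'",
    "view photos from a popular group",
]

sender_tasks = tasks_for("news-2010-09-01", shared_tasks, count=2)
receiver_tasks = tasks_for("news-2010-09-01", shared_tasks, count=2)
assert sender_tasks == receiver_tasks   # both sides derive the same tasks
```

Because both sides compute the same selection independently, nothing that identifies the photos needs to cross the censored network. The hard part, as the quote above suggests, is choosing a task list that looks innocuous while still narrowing the search enough to be practical.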
It is a clever technique, but there are, of course, some pitfalls. The
complexity will make it challenging to use, and automated retrievals may be
difficult to do in a non-suspicious manner. It could also end up pointing
a finger at "innocent" users of a site like Flickr, who unwittingly just
happen to perform the task associated with a Collage message. The paper
notes that risk, but also points out that "organizations can already
implicate users with little evidence".
Essentially, Collage is a proof-of-concept that uses off-the-shelf free
software to handle the encryption, encoding, and steganography pieces. So far,
the code for a demonstration client, which downloads a message that the
project stored in Flickr, is available. The web site does not specifically
mention further code releases, but one hopes the code for the sending side
will also become available. There are also some performance
measurements in the paper that show "acceptable" overhead for
sending small, textual messages.
The complexity is daunting, but for those who really need to communicate in
a largely deniable fashion, the Collage technique certainly has some
appeal. It doesn't suffer from some of the obvious "red flags" that arise
when using
Tor or normal encrypted traffic (e.g. SSL/TLS, ssh, GPG), which may make it
disappear into the noise of normal network traffic. Collage, or something
like it, may find a
place in the toolkit of those trying to evade internet censorship.