By Nathan Willis
October 24, 2012
Content-centric networking (CCN) is a novel approach to networking
that abstracts away the specifics of the connection, and focuses on
disseminating the content efficiently. This is in contrast to the
connection-oriented approach used in most IP applications, which
requires establishing a channel between two nodes with known
addresses. CCN excels at the comparatively common task of delivering
the same static documents to many end users, a task that puts
significant strain on the network when it is implemented over
one-to-one, connection-oriented TCP. The concept has been discussed
for decades, but Palo Alto Research Center (PARC, a subsidiary of
Xerox) is actively developing a real-life implementation
called CCNx, which is usable on
Linux and other UNIX-like systems today.
What centric what?
CCNx is the brainchild of PARC's Van Jacobson, and if anyone is
qualified to rethink core Internet protocols, Jacobson is. Among
other things, he fixed TCP flow control and designed the IP multicast
backbone. CCNx clearly draws on the lessons Jacobson has learned
about network congestion over the years; in a 2006 talk at
Google, he described how the NBC television network was slowed to a
crawl during the Olympics by thousands of web users requesting copies
of the same video clip. The data was identical and there was no
secrecy required; if the backbone of the network could only recognize
that the requests were identical, it could dispense with
retransmitting the clip from the originating server and instead make
use of the existing copies closer to the final hop.
That said, CCNx (and CCN in general) is not a replacement for existing
transport protocols; it is designed to run on top of them, and in fact
to be oblivious as to which mechanisms are used underneath: TCP, UDP,
IP multicast, link-level broadcast, or even point-to-point wireless.
The goal is that a party sends a request for a document out
into the open — with no destination address — and anyone
who hears the request and has a copy of the document can respond to
it. It is irrelevant whether the copy that is eventually returned
originates from disk storage on a server, memory in a gateway router,
or any other source. Naturally, making the network efficient means
that the closest party who both hears the request and has the
document should return it. In practice, CCN expects nodes to
intelligently cache the documents that they route to the end-user
nodes; doing so (and keeping popular documents close to the final hop
of the route) is what prevents congestion.
For the scheme to work, of course, the authenticity of the content must be
verifiable from the data itself. If that property holds, the most
noticeable benefit is that, when popular content is requested by
numerous end-users, there is far less congestion on the network
— ideally no additional congestion, as routers at the edges of
the network retransmit their existing copies of the content, without
even needing to propagate the requests upstream. There are other
benefits as well, such as the fact that participating nodes do not
need static or globally-unique names. This allows low-power sensors
to respond to requests (e.g., "what is the current temperature")
without needing a complete multilayer network stack, and it allows
clients to send such requests without knowing the topology of the
network.
On the flip side, CCN does pack more information into the names and
metadata of documents, incorporating things like versioning and
timestamps. This is necessary because, once a server publishes a
document over CCN, it no longer has control over the copies, which
propagate across the network. Consequently, all updates to a document
must be issued as superseding publications that can be
identified as updates referring to the original, and that can be
verified as authentic.
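To make the idea of superseding publications concrete, here is a minimal sketch of how a requester might pick the newest copy of a document. It assumes, purely for illustration, that every published name ends in an integer version component; that layout and the latest_version() helper are hypothetical and are not the CCNx encoding.

```python
# Hypothetical sketch: names are tuples of byte-string components, and the
# final component is an integer version written as ASCII digits. This layout
# is an assumption made for illustration, not the CCNx wire format.

def latest_version(published, base):
    """Return the highest-versioned name directly under `base`, or None."""
    candidates = [
        name for name in published
        if name and name[:-1] == base and name[-1].isdigit()
    ]
    if not candidates:
        return None
    # Compare versions numerically so that b"10" sorts after b"9".
    return max(candidates, key=lambda name: int(name[-1]))


published = [
    (b"example.org", b"report", b"1"),
    (b"example.org", b"report", b"3"),
    (b"example.org", b"report", b"2"),
    (b"example.org", b"other", b"9"),
]
newest = latest_version(published, (b"example.org", b"report"))
print(newest)
```

The original publication is never modified; a later version simply sorts ahead of it under the same prefix.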
CCNx specifics
CCNx tackles both the document-updating question and the
authentication question in its messaging scheme. Nodes ask for
content with an Interest message, in which the only required
field is the name of the desired content (although time-limits,
maximum number of hops, and other fields are available). Such sending
nodes could be either end-user applications making the original
request, or network infrastructure nodes passing along requests they cannot
answer.
The request is satisfied by a Data message that can be authenticated
as consistent with the original publisher; the original publisher
itself, however, never needs to be made aware of the request.
The Data message includes the requested data plus a
cryptographic signature. The signature is generated against the data
and an information block that contains a time stamp, the digest of the
publisher's public key (which is required for nodes to verify the signature),
and may optionally include other information such as the data type.
Nodes are supposed to check the signatures and discard any content
that fails verification; this "lazy" invalidation is intended to cut
down on spoofing attacks without introducing significant overhead.
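The check-and-discard step can be sketched in a few lines. Note the substitution: CCNx uses public-key signatures, but Python's standard library has no asymmetric signing, so this sketch swaps in an HMAC over a shared key as a stand-in. The DataMessage fields, the publish() and verify() helpers, and the byte layout that gets signed are all assumptions for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataMessage:
    name: str
    content: bytes
    timestamp: int               # part of the signed information block
    publisher_key_digest: bytes  # tells a verifier which key to use
    signature: bytes


def _signed_bytes(name, content, timestamp, key_digest):
    # The signature covers the content plus the information block.
    # Joining with NUL bytes is a simplification, not a real encoding.
    return b"\x00".join(
        [name.encode(), content, str(timestamp).encode(), key_digest]
    )


def publish(name, content, timestamp, key):
    """Build a signed message (HMAC stands in for a real signature)."""
    key_digest = hashlib.sha256(key).digest()
    payload = _signed_bytes(name, content, timestamp, key_digest)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return DataMessage(name, content, timestamp, key_digest, signature)


def verify(msg, known_keys):
    """Return True only if the signature checks out; callers drop the rest."""
    key = known_keys.get(msg.publisher_key_digest)
    if key is None:
        return False
    payload = _signed_bytes(
        msg.name, msg.content, msg.timestamp, msg.publisher_key_digest
    )
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, msg.signature)


key = b"demo-key"
known_keys = {hashlib.sha256(key).digest(): key}
good = publish("ccnx:/example.org/clip", b"video bytes", 1351036800, key)
forged = replace(good, content=b"tampered bytes")
print(verify(good, known_keys), verify(forged, known_keys))
```

Because the signature travels with the data, a node can run this check on a copy received from any neighbor without ever contacting the publisher.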
That is essentially all there is to CCNx; there are just two message
types. Additional features like encryption and application state
management are left entirely up to the layer above CCNx.
Participating nodes are allowed to shape traffic as best they see
fit. On the application side, that could mean interleaving requests
for chunks of large file downloads with higher-priority requests to
check mail. Because CCNx does not keep persistent connections open
between nodes, quality of service (QoS) is in the hands of the end-user.
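As a rough illustration of that kind of application-side traffic shaping, the sketch below queues outgoing requests by priority so that a mail check jumps ahead of pending download chunks. The InterestQueue class, the priority numbers, and the example names are all hypothetical; CCNx does not prescribe any of them.

```python
import heapq
import itertools


class InterestQueue:
    """Outgoing requests ordered by priority, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker preserves FIFO order

    def push(self, name, priority):
        # Lower numbers are sent first.
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


queue = InterestQueue()
for chunk in range(3):
    queue.push(f"ccnx:/example.org/big.iso/{chunk}", priority=5)
queue.push("ccnx:/example.org/mail/inbox", priority=1)

sent = [queue.pop() for _ in range(len(queue))]
print(sent)
```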
Interestingly enough, CCNx does not impose any restrictions on the
formatting of the actual document name, other than that it be a
sequence of bytes
and be hierarchical. The hierarchical dimension exists to allow
publishers to publish related content using the same prefix. That
could be interpreted as a given prefix representing a directory, or as
a given prefix representing small chunks of a single file that needs
to be reassembled further up in the application stack. The
documentation describes a URL-like syntax for CCNx names of the form
ccnx:/PARC/%00%01%02 and includes some recommended naming
conventions, but they are advisory only. For example, it
suggests using a DNS name for the first component in order to ease the
transition, and it recommends encoding the timestamp as another
component. Although optional, these conventions should allow nodes to
perform efficient matching of content names by comparing the prefixes
and without examining the data itself.
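Prefix matching on hierarchical names can be sketched as follows. The parse_name() and has_prefix() helpers are hypothetical, and the parsing is deliberately simplified: it only splits on slashes and percent-decodes each component, which is enough for the ccnx:/PARC/%00%01%02 example but is not the complete CCNx URI syntax.

```python
from urllib.parse import unquote_to_bytes


def parse_name(uri):
    """Split a ccnx: URI into a tuple of byte-string components."""
    scheme, _, path = uri.partition(":")
    if scheme != "ccnx":
        raise ValueError("not a ccnx: URI")
    return tuple(unquote_to_bytes(part) for part in path.split("/") if part)


def has_prefix(name, prefix):
    """True if `prefix` equals the leading components of `name`."""
    return name[:len(prefix)] == prefix


name = parse_name("ccnx:/PARC/%00%01%02")
print(name)
print(has_prefix(name, parse_name("ccnx:/PARC")))
print(has_prefix(parse_name("ccnx:/PARCX/file"), parse_name("ccnx:/PARC")))
```

Comparing whole components rather than raw strings is the design choice that matters here: ccnx:/PARC is a prefix of ccnx:/PARC/%00%01%02 but not of ccnx:/PARCX/file, and the match never has to look at the data itself.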
The strategy for running an efficient CCNx node is also left up to the
implementer, although here again the project's documentation includes
recommendations (under the "CCNx Node Model" sub-heading). The recommendation
includes maintaining a content store (CS) indexed by document name, a
table of unsatisfied Interest requests, and a table of outbound
interfaces on which unsatisfied Interests have been forwarded. It is
anticipated that a node will have multiple options at its disposal for
forwarding Interest messages it cannot fulfill; choosing
which links or routes are best at any given moment allows the node to
be opportunistic.
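A toy version of those three tables might look like the sketch below. It is not the ccnd implementation: the Node class, its method names, and the "face" labels are hypothetical, and it makes the simplifying assumption that an unanswerable Interest is forwarded on every upstream face rather than on an opportunistically chosen one.

```python
class Node:
    """Toy CCN node; each method returns a list of (face, message) actions."""

    def __init__(self, upstream_faces):
        self.content_store = {}  # name -> cached data
        self.pending = {}        # name -> faces still waiting for the data
        self.forwarded = {}      # name -> faces the Interest was sent out on
        self.upstream_faces = list(upstream_faces)

    def on_interest(self, name, from_face):
        if name in self.content_store:
            # Answer from the cache; nothing travels upstream.
            return [(from_face, ("data", name, self.content_store[name]))]
        already_pending = name in self.pending
        self.pending.setdefault(name, set()).add(from_face)
        if already_pending:
            return []  # an identical Interest is already outstanding
        out = [face for face in self.upstream_faces if face != from_face]
        self.forwarded[name] = set(out)
        return [(face, ("interest", name)) for face in out]

    def on_data(self, name, data):
        if name not in self.pending:
            return []  # nobody asked for this, so drop it
        self.content_store[name] = data
        self.forwarded.pop(name, None)
        waiting = self.pending.pop(name)
        return [(face, ("data", name, data)) for face in sorted(waiting)]


node = Node(["up0"])
clip = "ccnx:/example.org/clip"
first = node.on_interest(clip, "client1")    # forwarded upstream once
second = node.on_interest(clip, "client2")   # merged with the first request
delivered = node.on_data(clip, b"video")     # fans out to both clients
cached = node.on_interest(clip, "client3")   # served from the content store
print(first, second, delivered, cached, sep="\n")
```

Three clients ask for the same clip, but only one Interest leaves the node; that is the congestion-avoiding behavior described in Jacobson's Olympics example.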
Running CCNx
The CCNx distribution contains a handful of utilities that allow one
to test CCNx on a single machine or on the local network. The latest
release is 0.6.2, from October 3. It includes C source for both a
simple CCNx forwarder and a content repository, a simple CCNx chat
application written in Java, CCNx plugins for the VLC media player and
Wireshark packet sniffer, and Android versions of the repository and
chat applications. Ubuntu is the only Linux distribution tested, but
the dependencies are lightweight: libcrypto, expat, libpcap, and
libxml2.
With the software built, the first step is to start the CCNx daemon with
bin/ccndstart. This is a script that launches the
ccnd daemon and directs output messages to the terminal,
although you can also monitor its status from
http://localhost:9695 in a web browser. The ccnd daemon is
what passes CCNx messages to other nodes; how it does so depends on
the network transports defined in its configuration. For
testing on one machine, ccnd does not require any configuration;
however, editing ~/.ccnx/ccnd.conf is required to forward
CCNx requests between machines. The example configuration file is
light on detail; its only example entry is the line
add ccnx:/ccnx.org udp 224.0.23.170 59695
which tells ccnd to route all ccnx: URL requests that begin with
ccnx.org to UDP port 59695, over the 224.0.23.170 multicast
address. This address is registered with IANA for CCNx.
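To spell out what each field of that entry means, here is a small sketch that splits the line into its parts. The Route type and parse_add_line() helper are hypothetical, and the parser only understands the five-field form shown in the example file; it makes no claim about other syntax ccnd may accept.

```python
import ipaddress
from typing import NamedTuple


class Route(NamedTuple):
    prefix: str    # name prefix the route applies to
    protocol: str  # transport used to reach the next hop
    host: str      # destination address (here, a multicast group)
    port: int      # destination port


def parse_add_line(line):
    """Parse the five-field 'add' form from the example configuration."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "add":
        raise ValueError(f"unsupported line: {line!r}")
    _, prefix, protocol, host, port = fields
    return Route(prefix, protocol, host, int(port))


route = parse_add_line("add ccnx:/ccnx.org udp 224.0.23.170 59695")
print(route)
print(ipaddress.ip_address(route.host).is_multicast)
```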
The content repository can be started with the bin/ccnr
binary. It defaults to running the repository on the current
directory, but another location can be specified by setting the
CCNR_DIRECTORY environment variable. Similarly, a name
prefix for the available files can be set using the
CCNR_GLOBAL_PREFIX variable. The repository's other key
settings are configured in the data/policy.xml file, the most
important setting being which prefixes the repository should answer
for. By default, however, this prefix is empty, so the repository
will answer all requests — good for testing, but not terribly
practical for deployment.
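For scripted testing, the two environment variables can be prepared before launching the repository. The sketch below only builds the environment and does not start anything; the directory and prefix values are made-up examples, and the assumption that CCNR_GLOBAL_PREFIX takes a ccnx: URI should be checked against the ccnr documentation.

```python
import os


def ccnr_environment(directory, prefix, base_env=None):
    """Return a copy of the environment with the ccnr settings applied."""
    env = dict(os.environ if base_env is None else base_env)
    env["CCNR_DIRECTORY"] = directory      # where the repository stores files
    env["CCNR_GLOBAL_PREFIX"] = prefix     # name prefix for the stored files
    return env


# Example values are hypothetical.
env = ccnr_environment("/tmp/ccnx-repo", "ccnx:/example.org/files", base_env={})
print(env)

# To launch the repository with these settings, from the CCNx build tree:
#   import subprocess
#   subprocess.Popen(["bin/ccnr"], env=ccnr_environment(...))
```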
The file utilities include the command-line tools ccnls,
ccnputfile, and ccngetfile, as well as the graphical
file browser ccnexplore. Dropping files in and rearranging
them gets old after a few hours, but the chat application and VLC
plugin offer more amusement. Both make it clearer how CCNx's network
abstraction simplifies things from the user's perspective. To join a
chat room, for example, one needs only the name of the room (e.g.,
ccnchat ccnx:/testroom1); the underlying transport and
the network addresses of the participants never factor in.
In that sense, working in CCN is reminiscent of Zeroconf service
discovery, except that there is no discrete discovery step involved.
The long hierarchical document names suggest the route-embedding
features of IPv6 addresses as well; similarly, the ability to retrieve a
valid chunk of data from any source reminds one of BitTorrent. Of
course, it is difficult to assess the congestion-prevention
capabilities of CCN with just one or two machines, but the same would
be true for most traffic-shaping or QoS techniques.
There are still aspects of CCNx that have yet to be finalized, such
as how to avoid content naming collisions or spoofing.
Perhaps the advisory naming conventions will be formalized,
or perhaps if CCNx becomes an IETF standard, other techniques will
arise. Also, CCN offers better aggregate throughput on the
network by answering content requests with a nearby copy of the
document, rather than fetching the original again. The downside is
that publishers generally want to know page view statistics, so some
form of reporting may need to be devised.
In his Google talk, Jacobson described CCN as a different perspective
on how to use the network, rather than as a new suite of protocols.
He compared it to the difference between telephone companies'
circuit-switched networks and the first packet-switching data
networks. The wires and the nodes are the same; the difference is in how the
conversations and connections are expressed. Pessimists are
understandably unhappy with the glacial pace of the IETF or of
widespread IPv6 adoption, and the same people might argue that CCN
will never replace the entrenched protocols like HTTP that dominate
today. Perhaps it will not; it is still intriguing to experiment
with, however, and one should certainly never discount the commercial
Internet players' drive to adopt a new technology when it offers the
prospect of saving them money — which CCN certainly could.