By any reckoning, the ARM architecture is a big success; there are more ARM
processors shipping than any other type. But, despite the talk of
ARM-based server systems over the last few years, most people still do not
take ARM seriously in that role. Jason Taylor, Facebook's Director of
Capacity Engineering & Analysis, came to the 2013 Linaro Connect Asia
event to say that it may be time for that view to change. His talk was an
interesting look into how one large, server-oriented operation thinks ARM
may fit into its data centers.
It should come as a surprise to few readers that Facebook is big. The
company claims 1 billion users across the planet. Over 350 million
photographs are uploaded to Facebook's servers every day; Jason suggested
that, perhaps 25% of all photos taken end up on Facebook. The company's
servers handle 4.2 billion "likes," posts, and comments every day and
vast numbers of users checking in. To be able to handle that kind of load,
Facebook invests a lot of money in its data centers; that, in turn, has
led naturally to a high level of interest in efficiency.
Facebook sees a server rack as its basic unit of computing. Those racks
are populated with five standard types of server; each type is optimized
for the needs of one of the top five users within the company. Basic web
servers offer a lot of CPU power, but not much else, while database servers
are loaded with a lot of memory and large amounts of flash storage capable
of providing high I/O operation rates. "Hadoop" servers offer medium
levels of CPU and memory, but large amounts of rotating storage; "haystack"
servers offer lots of storage and not much of anything else. Finally,
there are "feed" servers with fast CPUs and a lot of memory; they handle
search, advertisements, and related tasks. The fact that these servers run
Linux wasn't really even deemed worth mentioning.
There are clear advantages to focusing on a small set of server types. The
machines become cheaper as a result of volume pricing; they are also easier to
manage and easier to move from one task to another. New servers can be
allocated and placed into service in a matter of hours. On the other hand,
these servers are optimized for specific internal Facebook users; everybody
else just has to make do with servers that might not be ideal for their
needs. Those needs also tend to change over time, but the configuration of
the servers remains fixed. There would be clear value in the creation of a
more flexible alternative.
Facebook's servers are currently all built using large desktop processors
made by Intel and AMD. But, Jason noted, interesting things are happening
in the area of mobile processors. Those processors will cross a couple of
important boundaries in the next year or two: 64-bit versions will be
available, and they will start reaching clock speeds of 2.4 GHz or
so. As a result, he said, it is becoming reasonable to consider the use of
these processors for big, compute-oriented jobs.
That said, there are a couple of significant drawbacks to mobile
processors. The number of instructions executed per clock cycle is still
relatively low, so, even at a high clock rate, mobile processors cannot get
as much computational work done as desktop processors. And that hurts
because processors do not run on their own; they need to be placed in
racks, provided with power supplies, and connected to memory, storage,
networking, and so on. A big processor reduces the relative cost of those
other resources, leading to a more cost-effective package overall. In
other words, the use of "wimpy cores" can triple the other fixed costs
associated with building a complete, working system.
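That argument can be illustrated with a back-of-the-envelope cost model. The figures below are purely illustrative assumptions, not Facebook's numbers; the point is only that, when a wimpy core delivers a fraction of a big core's throughput, the per-server fixed costs get multiplied by the number of extra servers needed:

```python
# A back-of-the-envelope version of Jason's "wimpy core" argument.
# All figures here are illustrative, not Facebook's actual numbers.

def cost_per_unit_of_work(cpu_cost, fixed_cost, relative_throughput):
    """Total server cost divided by the work one server can do.

    fixed_cost stands for everything besides the CPU: the rack slot,
    power supply, memory, storage, and networking.
    """
    return (cpu_cost + fixed_cost) / relative_throughput

# A big desktop-class processor: an expensive chip at full throughput.
big = cost_per_unit_of_work(cpu_cost=600, fixed_cost=900,
                            relative_throughput=1.0)

# A "wimpy" mobile processor: a cheap chip, but only a third the
# throughput, so three complete servers are needed to do the same work.
wimpy = cost_per_unit_of_work(cpu_cost=50, fixed_cost=900,
                              relative_throughput=1 / 3)

# The fixed costs per unit of work triple (900 -> 2700), which is what
# makes the wimpy system more expensive despite its far cheaper CPUs.
print(f"big cores:   {big:.0f} per unit of work")
print(f"wimpy cores: {wimpy:.0f} per unit of work")
```

With these made-up numbers, the wimpy system nearly doubles the cost per unit of work even though its CPUs cost a twelfth as much, because the fixed costs are paid three times over.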
Facebook's solution to this problem is a server board called, for better or
worse, "Group Hug." This design, being put together and published through
Facebook's Open Compute Project,
puts ten ARM processor boards onto a single server board; each processor
has a 1Gb/sec network interface which is aggregated, at the board level,
into a single 10Gb/sec interface. The server boards have no storage or other
peripherals. The result is a server board with far more processors than a
traditional dual-socket board, but with roughly the same computing power as
a server board built with desktop processors.
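The networking numbers in that design work out evenly, so the board's uplink is not oversubscribed; a trivial sketch of the arithmetic (using only the figures from the talk):

```python
# The Group Hug board's network arithmetic: ten per-processor links
# feed one board-level uplink.  Figures are those given in the talk.

processors_per_board = 10
link_gbps_per_processor = 1     # each processor's own interface
uplink_gbps = 10                # the board's aggregated interface

aggregate = processors_per_board * link_gbps_per_processor
assert aggregate == uplink_gbps  # every processor can run its link flat out
print(aggregate)
```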
These ARM server boards can then be used in a related initiative called the
"disaggregated rack." The problem Facebook is trying to address here is
the mismatch between available server resources and what a particular task
may need. A particular server may provide just the right amount of RAM,
for example, but the CPU will be idle much of the time, leading to wasted
resources. Over time, that task's CPU needs might grow, to the point that,
eventually, the CPU power on its servers may be inadequate, slowing things
down overall. With Facebook's current server architecture, it is hard to
keep up with the changing needs of this kind of task.
In a disaggregated rack, the resources required by a computational task are
split apart and provided at the rack level. CPU power is provided by boxes
with processors and little else — ARM-based "Group Hug" boards, for
example. Other boxes in the rack may provide RAM (in the form of a simple
key/value database service), high-speed storage (lots of flash), or
high-capacity storage in the form of a pile of rotating drives. Each rack
can be configured differently, depending on a specific task's needs. A
rack dedicated to the new "graph search" feature will have a lot of compute
servers and flash servers, but not much storage. A photo-serving rack,
instead, will be dominated by rotating storage. As needs change, the
configuration of the rack can change with it.
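The "RAM as a key/value service" idea can be sketched in a few lines. The class names and interface below are hypothetical (the talk did not describe Facebook's actual protocol); the sketch only shows the structural consequence: compute nodes hold no state of their own, so CPUs and memory can be scaled and replaced independently:

```python
# A minimal sketch of rack-level "RAM as a key/value service", in the
# spirit of the disaggregated rack.  The names and interface here are
# hypothetical; they are not Facebook's actual design.

class RemoteMemoryService:
    """Stands in for a rack "RAM box": compute nodes reach it over the
    network instead of addressing local DRAM directly."""

    def __init__(self):
        self._store = {}            # the box's DRAM, modeled as a dict

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)


class ComputeNode:
    """A CPU-only, Group Hug-style node that keeps no state of its own."""

    def __init__(self, memory_service):
        self.memory = memory_service

    def handle_request(self, user_id):
        # All state lives in the rack's memory service, so this node can
        # be swapped out or scaled independently of the RAM boxes.
        return self.memory.get(("profile", user_id))


ram_box = RemoteMemoryService()
ram_box.put(("profile", 42), {"name": "example user"})

node = ComputeNode(ram_box)
print(node.handle_request(42))
```

A real deployment would, of course, put a network protocol between the two classes; the dict merely marks where the rack's memory boxes would sit.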
All of this has become possible because the speed of network interfaces has
increased considerably. With networking speeds up to 100Gb/sec within the
rack, the local bandwidth begins to look nearly infinite, and the network
can become the backplane for computers built at a higher level. The result
is a high-performance computing architecture that allows systems to be
precisely tuned to specific needs and allows individual components to be
depreciated (and upgraded) on independent schedules.
Interestingly, Jason's talk did not mention power consumption — one of
ARM's biggest advantages — at all. Facebook is almost certainly concerned
about the power costs of its data centers, but Linux-based ARM servers are
apparently of interest mostly because they can offer relatively inexpensive
and flexible computing power. If the disaggregated rack experiment
succeeds, it may well demonstrate one way in which ARM-based servers can take a
significant place in the data center.
[Your editor would like to thank Linaro for travel assistance to attend
this event.]