
LFCS: A $99 supercomputer

By Jake Edge
May 8, 2013

Parallel computing is at least part of the future of computing. When even fairly low-end phones have multiple cores, it is clear where things are headed. In the future, hardware with even more parallelism will be available, but parallel programming is not keeping up. Part of the problem is the lack of inexpensive, highly parallel hardware—which is exactly what Andreas Olofsson and his team at Adapteva have set out to address.

In his keynote at the 2013 Linux Foundation Collaboration Summit (LFCS), Olofsson described the Parallella—a $99 supercomputer that Adapteva ran a Kickstarter project to design and build. He began by noting that his keynote followed one from Samsung, the world's largest electronics company, while he was representing "perhaps the smallest" electronics company. Olofsson is not only the CEO, he is also one-third of the engineering team, he said with a chuckle. Despite being so small, the company was able to build a 64-core 28nm chip that delivers 100 GigaFLOPS at 2 watts.

[Andreas Olofsson]

Adapteva created that chip two years ago and went around the world trying to sell it, but that effort was met with "crickets", he said. The world just wasn't ready yet, so that's where the Kickstarter idea came from. Adapteva is turning into a systems company because "people want computers, they don't just want chips".

Energy efficiency has not kept up with Moore's law, which creates an "efficiency gap" that impacts both battery-powered and plugged-in systems, he said. The gap has only started to widen in the last few years, which is why most people don't care—yet. But it is a problem that is getting worse and worse and will be quite severe by 2020 or 2025. We have to start thinking differently, he said.

The architecture that everyone wants is an infinitely fast single-threaded CPU with infinite memory and infinite bandwidth to that memory, which is, of course, impossible to have. We keep trying new things when we hit various limitations, scaling up the processor frequency, issuing multiple instructions per cycle, adding SIMD (single instruction, multiple data), and eventually adding more cores. "Now what?", he asked. We are "running out of tricks". We have seen this play out in servers and desktops, and are seeing it play out in mobile "as we speak".

When he started the company in 2008, there were a number of chip-design trends that he believed would shape the future of computing: things like power consumption, memory bottlenecks, thermal density, yield issues, software complexity, and Amdahl's law. But we have a very different example of computing that is close to hand: our brains. Whenever you look at the brain, you realize that what we have been designing is so primitive in comparison to what nature has already accomplished, he said. It is massively parallel (billions of neurons), energy efficient (30 watts), heterogeneous (different parts of the brain handle different functions), and robust (losing a small part generally does not shut the whole thing down).
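Amdahl's law, one of the trends Olofsson listed, is worth pausing on: it caps the speedup from adding cores by the fraction of a program that must still run serially. A minimal sketch (plain Python; the 95% parallel fraction is an arbitrary illustration, not a figure from the talk):

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Upper bound on speedup when only part of a program parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

# Even with 95% of the work parallelized, 64 cores give well under 64x:
print(round(amdahl_speedup(0.95, 64), 1))  # → 15.4
```

The serial 5% dominates long before the core count does, which is part of why "just add cores" is not, by itself, a scaling strategy.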

The "practical vision" for today is heterogeneous, Olofsson said. We have system-on-chips (SoCs) that can combine different kinds of functional units, such as "big" CPUs, GPUs, and FPGAs. But, we won't see 1000 ARM cores on a single SoC, he said. What Adapteva has done is to add hundreds, or even thousands, of small RISC CPUs into a SoC with ARM or x86 cores as the main processor.

Programming

The current state of parallel programming is not up to the task of writing the code that will be needed for massively parallel systems of the future. The challenge is to make parallel programming as productive as Java or Python is today, Olofsson said. We should get to the point where GitHub has thousands of projects that are using parallel programming techniques that run on parallel hardware.

To get to this world of parallel ubiquity, the challenges are "immense". The computer ecosystem needs to be rebuilt. Billions of lines of code need to be rewritten, and millions of programmers need to be re-educated. In addition, the computer education curriculum needs to change so that people learn parallel programming in the first year of college—or even high school.

In his mind, there is no question that the future of computing is parallel; "how else are you going to scale?". Where else can you get the next million-times speedup, he asked. But there is a question of "when?" and it is "going to hurt before we get there". There is no reason not to start now, though, which is where some of the ideas behind Parallella come from.

Inspired by the Raspberry Pi, Parallella is a $99 credit-card-sized parallel computer. Adapteva is trying to create a market for its chip, but it is also "trying to do good at the same time". The only way to build up a good-sized community around a device like Parallella is to make it open, "which is hard for a hardware guy", he said. But that makes the platform accessible to interested hackers.

Parallella is cheap and easy to use, Olofsson said. He wishes Adapteva could sell a million of them (like Raspberry Pi), but thinks that's unlikely. If it sold 10,000 and people took them and did innovative things using Parallella, that would be a big success.

The Kickstarter tried to raise $750,000 in 30 days and was successful in doing so. He and his team have spent the last six months trying to deliver what had been promised. The goal was a 5-watt computer that could do 100 GFLOPS, which was achieved. There are many different applications for such a device including software-defined radio, ray tracing, image processing, robotics, cryptography, gaming, and more.

The system has a dual-core Cortex-A9 processor that runs Linux. It has most of the peripherals you would expect, but it also has a built-in FPGA. That device can be configured to "do anything" and is meant to augment the main processor. In addition, there is Adapteva's Epiphany co-processor, which brings many small RISC cores to the system.

At the time of the talk, it had only been five days since Adapteva had received the boards. Both 16-core and 64-core Epiphany versions were delivered (Olofsson is holding one of each in the photo above). Each consumes around 5 watts and he is "excited to get them in the hands of the right people". By the second day, the team could run "hello world" on the main CPU. Earlier in the day of the talk (day 5), he heard from another member of the team that it could talk to the Epiphany and run longer programs on the A9. Six months ago, he didn't know if you could actually build this type of system in credit card form factor with a $99 price point, but it can be done.

Now Adapteva needs to ship 6300 systems to people who donated to the Kickstarter, which is no easy task. There are some serious logistics to be worked out, because "we want people to be happy" with their boards. Adapteva also wants to give away free kits to universities. Building a sustainable distribution model with fewer than five people in the company will be challenging. It is running a little late, he said, but will be shipping all of the boards in the (northern hemisphere) summer.

Olofsson concluded with Adapteva's plans after all the shipping and handling: next up is "massive parallelism" with 1024-core Epiphany co-processors. It will be interesting to see what folks will do with that system when it becomes available.



Bitcoin mining

Posted May 9, 2013 7:25 UTC (Thu) by osma (subscriber, #6912) [Link]

My first thought reading this was, will this be good for bitcoin mining? It is limited by the cost of suitable hardware, and electricity to run it, and this looks promising on both areas. Unfortunately, it seems that dedicated ASICs are still much better: https://bitcointalk.org/index.php?topic=178180.msg1856500...

LFCS: A $99 supercomputer

Posted May 9, 2013 8:17 UTC (Thu) by Riba78 (subscriber, #84615) [Link]

I think what is left unsaid in the hype about Adapteva Epiphany products is that they are not general-purpose CPUs. You can't, for example, take the Apache source code, compile it for Epiphany, and get a performance boost.

The Epiphany architecture is more like a GPU: an array of simple cores with small local memories and a shared memory bus. To use Epiphany, you need to port your software specifically to that platform (like the PS3 Cell) or use OpenCL (as for GPUs). This makes it usable only for special-purpose tasks that are already designed for DSPs and GPUs: signal processing, rendering, scientific calculations, etc.

For PCs, you already have hundreds of GPU cores available on any modern display card, programmable for computational tasks via OpenCL. Even the mobile sector is getting more GPUs; Tegra 4 seems to have a 72-core GPU and claims 92 GFLOPS performance. It would be interesting to see Parallella benchmarked against Tegra 4.

LFCS: A $99 supercomputer

Posted May 9, 2013 13:43 UTC (Thu) by kpfleming (subscriber, #23250) [Link]

That wasn't 'missing' at all, it was the bulk of the content of the post. He said multiple times that software needs to be rewritten to take advantage of wide parallelism, and even that programmers will need to learn new skills (be retrained). Having a $99 board to use to learn such techniques on is a huge step in the right direction.

LFCS: A $99 supercomputer

Posted May 9, 2013 14:00 UTC (Thu) by Funcan (subscriber, #44209) [Link]

What kpfleming said. Think of this like the Raspberry Pi - you're probably not going to replace all your existing systems with it, but it is fairly cheap, hopefully great for teaching on, and people will find novel uses for it.

LFCS: A $99 supercomputer

Posted May 11, 2013 3:29 UTC (Sat) by Trelane (guest, #56877) [Link]

For teaching supercomputing, the LittleFe cluster design is hard to beat. About $1-2k, fits in checked luggage, CUDA+OpenMP+MPI.

Of course, this could perhaps serve in a similar system.

LFCS: A $99 supercomputer

Posted May 9, 2013 15:17 UTC (Thu) by pboddie (guest, #50784) [Link]

The Epiphany architecture is more like a GPU; an array of simple cores with small local memories and shared memory bus.

There's no denying that having lots of cores in a single package is going to lead to a certain style of architecture, but could you clarify what you mean by "like a GPU"? Looking at the instruction set for, say, the AMD R700 (PDF), it's quite a different animal from that of the Epiphany. Certainly, the latter has memory instructions that look a lot more general purpose than those of the R700.

LFCS: A $99 supercomputer

Posted May 9, 2013 22:47 UTC (Thu) by Riba78 (subscriber, #84615) [Link]

The point I was trying to make is that "Epiphany is not a general-purpose CPU". The reference manual reads more like a DSP's than a CPU's. Based on a few web searches, I noted that Epiphany has no MMU, its memory system demands careful consideration when programming, and it is not self-hosting, so it needs to be programmed from a host platform (via OpenCL or a cross-GCC).

Epiphany was touted for its GFLOPS performance, so it's natural to compare it to GPUs, which are the other big GFLOPS muscle. Both Epiphany and most GPUs are programmable via OpenCL and thus can be compared in capabilities and performance through that API. If you can't run comparable loads, you might as well compare BogoMIPS. If Epiphany could run Linux, we could use the informal benchmark of checking how long it takes to compile the Linux kernel.

Sorry for my dismissive attitude. Parallella is a nice project, but IMHO it's hyped up more than it warrants.

LFCS: A $99 supercomputer

Posted May 11, 2013 2:39 UTC (Sat) by jmorris42 (subscriber, #2203) [Link]

Yup, it is truly the Pi of the parallel world, a product with hype far exceeding its actual usefulness.

I'd really like to see a benchmark showing the 64-core version to be faster than a cell phone GPU at some practical real-world tasks before getting excited. 32KiB of RAM per node, subdivided into four 8KiB banks, is going to limit it to very specialized tasks. The total available RAM is only 2MiB, so forget physics modeling. A PC's GPU these days, on the other hand, has hundreds to thousands of times that much RAM available.

That leaves the power argument. Again, a cell phone can't draw 5W. The battery wouldn't last long enough to be useful, since most phones are lucky to have 5Wh of capacity, and that label rating is only valid when discharging over an hour anyway.

LFCS: A $99 supercomputer

Posted May 9, 2013 15:50 UTC (Thu) by jnareb (subscriber, #46500) [Link]

Is it much cheaper than GPGPU-capable GPU (from nVidia or AMD)?

I wonder if software and tools developed for Tilera64 etc. platforms can be easily ported to Epiphany board...

LFCS: A $99 supercomputer

Posted May 9, 2013 17:40 UTC (Thu) by hamjudo (subscriber, #363) [Link]

At 14 cents per kilowatt hour, it would cost me $6.13 to leave one of these things calculating for a full year. Some graphics cards average well more than 100 watts. 100 watts for a year is $122.64.
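That arithmetic can be sanity-checked in a couple of lines (plain Python; the wattages and the 14-cents-per-kWh rate are the figures above):

```python
def yearly_cost_usd(watts, usd_per_kwh=0.14):
    """Electricity cost of running a device 24x7 for one year."""
    kwh = watts / 1000 * 24 * 365
    return kwh * usd_per_kwh

print(round(yearly_cost_usd(5), 2))    # → 6.13
print(round(yearly_cost_usd(100), 2))  # → 122.64
```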

On the other hand, it is also slower and has much less RAM than those high power GPGPU-capable GPUs. Each core on a GPU is like a SIMD processor, and does certain classes of operations much much faster than this thing. Each core on these guys is like a little general purpose computer, without enough RAM to run Linux.

LFCS: A $99 *not* supercomputer

Posted May 9, 2013 21:36 UTC (Thu) by daglwn (subscriber, #65432) [Link]

Ok, I work in HPC so I'm admittedly biased, but I also have deep knowledge of the domain.

This hardly qualifies as a supercomputer. It's cool technology, yes, and the price point is right. Massive parallelism *will* have to go mainstream, there's little doubt about that.

But a supercomputer is much more than a chip. It's a system: processor, memory, network, I/O, software. Each of these things is tightly coupled and highly tuned for maximum performance at reasonable (for HPC) prices.

So yes, hooray for Adapteva! But as far as I can tell Andreas never claimed Parallella is a supercomputer and neither should anyone else.

LFCS: A $99 *not* supercomputer

Posted May 9, 2013 22:13 UTC (Thu) by jake (editor, #205) [Link]

> But as far as I can tell Andreas never claimed Parallella is a
> supercomputer and neither should anyone else.

Hmm, that's been said quite a few different places, by Jim Zemlin in the introduction of Andreas for one, as well as:

http://www.kickstarter.com/projects/adapteva/parallella-a...

which is the kickstarter, titled:

Parallella: A Supercomputer For Everyone

I can't say for sure that Andreas wrote that, but one would guess he could at least have put a stop to it if he wanted.

And the board does have:

> processor, memory, network, I/O, software

perhaps not in sufficient quantity/quality for your taste, however.

jake

LFCS: A $99 *not* supercomputer

Posted May 10, 2013 21:40 UTC (Fri) by daglwn (subscriber, #65432) [Link]

> I can't say for sure that Andreas wrote that, but one would guess he could
> at least have put a stop to it if he wanted.

It's too bad if Andreas is pushing this view. It's misleading.

> And the board does have:
> processor, memory, network, I/O, software

Well of course it does. It is a computer system after all.

> perhaps not in sufficient quantity/quality for your taste, however.

It's not about my taste at all. Ask anyone in HPC. The fact that processor, memory, network, I/O and software are robust, tightly coupled and highly tuned is what separates a supercomputer from a cluster and from Parallella.

Just as one example, GigE is nowhere near enough to handle climate simulations, modeling stars, and analyzing combustion. It's not just about bandwidth and latency.

LFCS: A $99 *not* supercomputer

Posted May 11, 2013 13:26 UTC (Sat) by nix (subscriber, #2304) [Link]

Of course, there's no way that anything faster than GigE can be used in anything even remotely consumer-grade until it gets a bit cheaper. I considered going to 10GbE on the local net a while back, but looked at the prices, screamed, and decided maybe in ten years or when I win the lottery. Just a simple four-port switch was several thousand pounds!

LFCS: A $99 *not* supercomputer

Posted May 13, 2013 18:08 UTC (Mon) by daglwn (subscriber, #65432) [Link]

"It's not just about bandwidth and latency."

LFCS: A $99 *not* supercomputer

Posted May 11, 2013 3:42 UTC (Sat) by Trelane (guest, #56877) [Link]

It's not going to land on the Top 500, no. But I don't think that clustered supercomputers are the definition of supercomputer. Personally, I think that the high degree of parallelism warrants the use of the term.

(I also work in HPC :-)

LFCS: A $99 *not* supercomputer

Posted May 13, 2013 18:10 UTC (Mon) by daglwn (subscriber, #65432) [Link]

> Personally, I think that the high degree of parallelism warrants the use.

Then a GPU board is also a supercomputer. Because that's basically what this thing appears to be.

But <fist bump> for HPC work. :)

LFCS: A $99 *not* supercomputer

Posted May 14, 2013 12:28 UTC (Tue) by jzbiciak (✭ supporter ✭, #5246) [Link]

Of course, how are you going to feed a Parallella? Unless I misunderstood the diagram, the pipes into it are way too small. It'll work for exhaustive-search type algorithms with small working sets and high locality, but I'm skeptical it scales to larger problems well.

I work on processors that get pressed into HPC duty, and we see the rest of the system as being as important, or often even more important than the CPUs themselves. It's about machine balance.

(I thought about linking one of our latest chips, but then considered it might be bad form.)

Not really open source

Posted May 9, 2013 22:16 UTC (Thu) by yootis (subscriber, #4762) [Link]

These guys make a big deal out of being open source, but they aren't any more open source than an Intel Core i7. Sure, they use GCC and open tools to compile, but so do ARMs and Intel chips. They could release the design of their chip, but they don't. It isn't open source unless you release your designs.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds