Multicores are admission of defeat - and they are here to stay...

Posted Apr 28, 2008 4:46 UTC (Mon) by nevyn (subscriber, #33129)
In reply to: Multicores are admission of defeat - and they are here to stay... by khim
Parent article: Interview with Donald Knuth (InformIT)

Yes, it's going to be really hard to make HW faster in the future. All the HW people are saying that the big obvious gains have been found, and there are no more to come (for serialized instructions). Indeed, there hasn't been a must-have CPU upgrade in the last year or two, although CPUs have got faster and in some cases the prevalence of dual cores has been a boon. However, that doesn't mean you can wave a magic wand and say "all software will be multi-threaded, and run correctly".

Now, on the other hand, we've had the possibility of doing multi-tasking for at least 20 years (via fork() + large Unix boxes); mmap with a unified cache might be a bit more recent. pthread_create() is more recent still, and having it all be accessible via Linux is even more recent. But it's fair to say that you could "fairly easily" get access to 2-CPU boxes 10 years ago.

But with 10 years' lead time the SW has basically made zero progress: it's still just as hard and just as error-prone to write C+pthread code. Now maybe, due to dual core by default, there will be some breakthrough in the next 10 years ... or maybe we'll all magically start (re-)writing in Erlang/whatever. But personally I doubt it; I find it much easier to believe that the answer from the SW people will be "128 core CPUs are irrelevant, start getting used to things not getting faster".

The combustion engine didn't get significantly faster forever, and the world didn't end ... I imagine the same will be true here.


128 core CPUs are irrelevant, start getting used to things not getting faster? Wrong answer!

Posted Apr 28, 2008 5:54 UTC (Mon) by khim (subscriber, #9252)

We had no progress for 50 years with multi-core algorithms because there was no need: hardware people did most of the work. Now they've stopped doing it. And of course there is a huge number of programs which can benefit from 128 cores - at least in theory. Which ones will be rewritten depends on the speed of said program: sure, you can rewrite ls to be multicore-aware, but in the real world few uses of ls would be accelerated, so there is nothing to gain there. With convert (from ImageMagick), though, it's a different story. There are tons of programs which can (and will) be multithreaded, and more still which are not really needed but will be used anyway (how many people need games? Yet today's GPU is mostly the result of this pseudo-need).

It's just that for a long time software people had the luxury of faster and faster CPUs every few years and had no real pressing need to use SMP. Today they are forced to use SMP. It's a different situation, and it'll lead to a different outcome.

128 core CPUs are irrelevant, start getting used to things not getting faster? Wrong answer!

Posted Apr 28, 2008 6:44 UTC (Mon) by drag (subscriber, #31333)

Some things, like graphics, are natural things to parallelize. You're dealing with multiple
tiles in an MPEG video, or some number of discrete mathematical functions being used over and
over again on a large number of 3D objects or whatnot.


Then there are other things, like the ImageMagick you mentioned, that can be set up to run in
batch mode, where you have lots of calculations to do over and over again on a number of
images or whatnot. With that there probably isn't much need to do any multithreading. Just
fork it once for each image you're working on. Batch programming at its best.

Even for static renderings with raytracing this is the way to go. The multithreaded portion
would be relatively small; just fork it and stitch the images back together after they've
been rendered.

For lots of this stuff it would make sense to do multithreading if you're in a
Windows-only world, but with Linux the overhead of fork is much smaller, and forking leads to
simpler and less buggy programs.
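
A minimal sketch of the fork-per-image batch approach described above, in Python on a Unix-like system. The names process_image and run_batch are hypothetical, and the per-image "work" is only a stand-in for a real job such as running convert on a file. Each child does one item, sends its result back through a pipe, and exits; no threads or locks are involved.

```python
import os


def process_image(name):
    # Stand-in for real per-image work (hypothetical): a cheap checksum of the name.
    return sum(ord(ch) for ch in name)


def run_batch(names):
    """Fork one child per item and collect each result through a pipe."""
    children = []
    for name in names:
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child process: do the work, report the result, then exit
            # immediately without running any of the parent's cleanup code.
            os.close(read_fd)
            status = 0
            try:
                os.write(write_fd, str(process_image(name)).encode())
            except BaseException:
                status = 1
            finally:
                os._exit(status)
        # Parent process: keep only the read end of this child's pipe.
        os.close(write_fd)
        children.append((name, pid, read_fd))

    results = {}
    for name, pid, read_fd in children:
        with os.fdopen(read_fd, "rb") as pipe:
            data = pipe.read()  # returns once the child has closed its end
        _, status = os.waitpid(pid, 0)
        if status != 0:
            raise RuntimeError(f"worker for {name!r} failed")
        results[name] = int(data)
    return results


if __name__ == "__main__":
    print(run_batch(["a.png", "b.png", "c.png"]))
```

For a large batch you would presumably cap the number of live children at roughly the number of cores rather than forking everything at once; the sketch leaves that out to stay short.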

And of course there is multitasking and all that. Even then, how many things do you have going
on at once? 3 things? 4 things? How is that going to translate to big savings when you're
dealing with an entry-level Dell desktop computer with 8-16 cores?

Do you really want to see something with the complexity of Firefox or OpenOffice.org have its
code base refactored to support multithreaded programming? They have a tough enough job
now as it is trying to maintain their code base without all that extra overhead.

How many programs that you use on a daily basis could benefit from it?

I did see big benefits from moving from one CPU to two. I always wanted to do that, but I
couldn't justify the expense of an SMP board and such for home use.

Going from two CPUs to four will probably bring some benefits.

For rendering games and doing other stuff, integrating the GPU into the CPU die will help. So
you're going to take advantage of maybe another 4 cores.

If people start doing crazy stuff with raytracing and graphics then I can see taking advantage
of up to 16 cores... Plus it'll make video processing much less buggy since we can hopefully
get away from the proprietary hell that Nvidia and (historically) ATI have driven us into.

I donno. 

Intel and friends are talking about _EIGHTY_ cores. (Maybe they'll use 4-8 for normal tasks
and then the rest for graphics and a few specialized cores for giggles and grins?)


It seems that they want programmers to start using parallel processing on even basic tasks. I
think that this is a tall order given that, in general, software is in such a lousy state as
it is right now. I mean, it's not like even very advanced programmers have perfected
single-threaded application programming.

128 core CPUs are irrelevant, start getting used to things not getting faster? Wrong answer!

Posted Apr 28, 2008 7:29 UTC (Mon) by ekj (subscriber, #1524)

But the interesting question isn't what percentage of -all- loads can be parallelized. The interesting question is what percentage of the loads that MATTER can be parallelized.

The loads that -matter- are those where you would like to do more, but are limited by CPU. Or where you'd like to do what you do today quicker.

Of course in principle you -always- want to go quicker, but if "ls" already spends 0.042s waiting for I/O and 0.002s doing CPU work, then it really is of no practical importance whether those 0.002s could be efficiently 128-way parallelized.
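
To put a number on the "ls" figures above, here is a small sketch of the arithmetic in Python. It assumes the idealized best case, which is not claimed anywhere in this thread as realistic: the I/O wait is unchanged and the CPU part divides perfectly across the cores with no overhead. The function name wall_time is hypothetical.

```python
def wall_time(io_seconds, cpu_seconds, cores):
    """Elapsed time if only the CPU part is split evenly across the cores."""
    return io_seconds + cpu_seconds / cores


serial = wall_time(0.042, 0.002, cores=1)      # 0.044s in total
parallel = wall_time(0.042, 0.002, cores=128)  # the I/O wait still dominates
speedup = serial / parallel                    # about 1.047

print(f"{serial:.6f}s -> {parallel:.6f}s, speedup {speedup:.3f}x")
```

Even with perfect 128-way scaling the whole run gets under 5% faster, because the 0.042s of I/O wait is untouched.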

I think that -most- loads that matter can be parallelized easily. In some cases TRIVIALLY. Can you give a few examples of real-world cases where waiting for the CPU is a real concern, but the problem is not parallelizable?

I know that the things where I spend time waiting for my CPU are easily parallelizable:

  • Transcode a movie
  • Encrypt/decrypt large amounts of data.
  • Pack/unpack images (PNG, JPG, Raw)
  • FLAC/Ogg-code wav-files
  • Graphics performance, particularly 3D.

But really, mostly when I'm waiting for my computer, I'm waiting for it to do IO, not waiting for it to compute. Thus advances in IO, such as seek-less solid-state disks and faster internet links, are much more relevant to me than faster CPUs. I suspect this is true for most personal computer users.

The lightspeed limit bites for IO too, of course, particularly the type where I'm doing IO off some device in Australia.

Multicores are admission of defeat - and they are here to stay...

Posted Apr 28, 2008 11:28 UTC (Mon) by smitty_one_each (subscriber, #28989)

It struck me that the distinction between the hardware and the operating system is blurring a
bit. If the kernel is judiciously sprinkling processes over multiple cores, then it seems hasty
to decry this industry direction, even if the human brain has trouble grasping how to span a
single process across multiple cores: so what?

Multicores with a powerful language rule

Posted Apr 28, 2008 16:08 UTC (Mon) by ncm (subscriber, #165)

You can say we've made zero progress only if, in fact, you pay no attention to the progress
that has been made.

VSIPL++ transparently parallelizes array and image processing operations, just about
completely hiding its use of MPI libraries underneath.  It scales linearly with the number of
processors.  It does this using the very powerful abstraction mechanisms that only became
widely usable in Standard C++, and are still available only in C++ and in a few researchy
languages like Haskell and OCaml.

The lesson here is that progress on fundamentally difficult problems comes not just from hard
work (and the developers of VSIPL++ implementations have certainly worked hard) but from
fundamentally powerful language mechanisms.  Most problems are easy, and popular scripting
languages make it quick to deal with easy problems, but hard problems fall only to powerful
languages and people skilled in using powerful languages.  (The reader should note that
"object-oriented" features were not especially helpful in solving this problem.)

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds