
Building a High-Performance Cluster with Gentoo

Posted Apr 11, 2007 4:34 UTC (Wed) by njs (guest, #40338)
In reply to: Building a High-Performance Cluster with Gentoo by gdt
Parent article: Building a High-Performance Cluster with Gentoo

There are now plenty of distributions that have excellent incremental upgrade support -- Debian is the classic leader here (and my preference), but it's not unique. AFAIK Red Hat does pretty well these days too. So portage might well beat out RH9 (which is what, 4 years old at this point?), but that's not really saying much.

And, if your criterion is minimizing the risk of upgrades, then a source-based distribution like Gentoo will necessarily be worse than a binary-based one. With a binary-based distribution, everyone is running exactly the same executables, and the chance that you will be the first person to trip over some bug is minimized. With a source-based distribution, it's entirely possible that you are the only person in the world to have packages built with your exact combination of header files, compiler version, and USE and compiler flags -- so even if the bug tracker says that some piece of software has been out for 6 months with no reported problems, that's no guarantee that it'll work for *you*. Of course, you can minimize this by sticking to well-known compiler versions and declining to fiddle with compile flags, but if you're doing that then why bother with a source-based distro at all?
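To make the "only person in the world" point concrete, here is a rough sketch that counts how many distinct builds of one package are possible. The compiler versions, flag sets, and USE flags are hypothetical placeholders, and the model assumes every USE flag can be toggled independently, which real packages do not always allow.

```python
from itertools import product


def count_build_configs(compilers, cflag_sets, use_flags):
    """Count distinct builds, assuming each USE flag is independently on or off."""
    return len(compilers) * len(cflag_sets) * 2 ** len(use_flags)


def enumerate_build_configs(compilers, cflag_sets, use_flags):
    """Yield every (compiler, cflags, enabled USE flags) combination."""
    for compiler, cflags in product(compilers, cflag_sets):
        for states in product((False, True), repeat=len(use_flags)):
            enabled = tuple(f for f, on in zip(use_flags, states) if on)
            yield compiler, cflags, enabled


# Hypothetical values, chosen only for illustration.
compilers = ["gcc-3.4", "gcc-4.1"]
cflag_sets = ["-O2", "-O3", "-Os"]
use_flags = ["ssl", "ipv6", "X", "gtk"]

print(count_build_configs(compilers, cflag_sets, use_flags))  # 2 * 3 * 2**4 = 96
```

Even this toy case yields 96 possible builds of a single package, before header-file versions are taken into account, so testing done by other users is spread thinly across the space.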

Building a High-Performance Cluster with Gentoo

Posted Apr 11, 2007 7:14 UTC (Wed) by amacater (subscriber, #790) [Link]

The classic answer on the Beowulf list: It depends. It depends on whether
you admin. your own server or have to rely on central admin. It depends on
the size of your cluster and, more importantly, who your hardware vendors
are. If you buy a 2048-node cluster from IBM, to some extent it's easier to
take the hardware vendor's choice of distro and cluster admin tools. HP's
choice may be different from Penguin's. Two further considerations: fast
interconnect hardware (Quadrics/Mellanox ...), which is essential for
some classes of problem, needs drivers. The companies are relatively small
in terms of staff size and are operating on tight margins in a small
market. It may be that they haven't time to sort out a Debian/Gentoo/Yellow
Dog ... hardware card driver. Lastly, there's the high performance compiler
writers and high-end proprietary software types: they want to debug a known
kernel/memory combination when they get oops reports. You can run highly
successful infrastructures on whichever distribution you like - as ever,
your problem set, resources, time and effort will differ from everyone
else's and, sometimes, it's easier to buy a system off the shelf so that
your users can concentrate on coding and running jobs. Read the Beowulf
list archives for this discussion and minor variants - many times :)

Building a High-Performance Cluster with Gentoo

Posted Apr 11, 2007 16:21 UTC (Wed) by dlang (subscriber, #313) [Link]

using the approach described above you don't have different systems running different versions of things (unless you want them to). with the binary package server you have one box compile the code with the optimizations that you want, and then it makes the results available to all the other systems (assuming that they are identical).
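One way to confirm that every node really did end up with the same packages from the build box is to compare per-node package lists. The sketch below assumes you have already collected a mapping of package name to version for each node; the node names, package names, and the `find_version_drift` helper are all hypothetical, and how the lists are gathered is left out.

```python
def find_version_drift(node_packages):
    """Return {package: {version: [nodes]}} for packages not identical on every node."""
    by_package = {}
    for node, packages in node_packages.items():
        for name, version in packages.items():
            by_package.setdefault(name, {}).setdefault(version, []).append(node)

    node_count = len(node_packages)
    drift = {}
    for name, versions in by_package.items():
        installed_on = sum(len(nodes) for nodes in versions.values())
        # Drift means more than one version in use, or the package missing somewhere.
        if len(versions) > 1 or installed_on != node_count:
            drift[name] = versions
    return drift


# Hypothetical inventory, for illustration only.
inventory = {
    "node01": {"libfoo": "1.2", "solver": "3.0"},
    "node02": {"libfoo": "1.2", "solver": "3.0"},
    "node03": {"libfoo": "1.1", "solver": "3.0"},
}

print(find_version_drift(inventory))
```

An empty result means the nodes agree with each other; it says nothing about whether that shared build matches what anyone outside the cluster is running.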

I haven't done head-to-head performance comparisons with Gentoo, but I have seen cases where optimizing the kernel could result in 20-30% performance improvements in the past (back in the 1GHz Athlon days). on modern 64-bit hardware it's less of an issue because there's less variability between hardware, and therefore less difference between optimized versions and the generic versions.

where I actually see the benefit of Gentoo where I use it (my home server) is in the ability to configure the packages with the options and dependencies that I want them to have (this means turning on some that other distros would leave off, but mostly turning off options that other distros turn on but I don't care about).

Building a High-Performance Cluster with Gentoo

Posted Apr 12, 2007 6:29 UTC (Thu) by njs (guest, #40338) [Link]

>using the approach described above you don't have different systems running different versions of things (unless you want them to).

You misunderstand -- the point is that all your systems might be the same, but they'll be different from everyone else's systems. For instance, they will be different from those of the people you let upgrade to the cool new version of Foobar2000 first, so that they could trip over the nasty bugs and get them fixed before you hit them. (Plus the maintainers tasked with fixing those bugs have a huge combinatorial space of configurations they are trying to support.)

Building a High-Performance Cluster with Gentoo

Posted Apr 12, 2007 13:06 UTC (Thu) by nix (subscriber, #2304) [Link]

Indeed. This is one of the reasons *why* I run bleeding-edge software on all my systems for which stability is relatively unimportant: specifically so that I can find niggling portability bugs before other people. I find a few a month, typically (sometimes a few a week, sometimes none for a month or two, but the trickle never stops completely).

Building a High-Performance Cluster with Gentoo

Posted Apr 19, 2007 12:35 UTC (Thu) by piggy (subscriber, #18693) [Link]

I would question the claim that a source-based distro necessarily sees a higher risk of encountering obscure and subtle bugs than a binary-based distro. Your reasoning is sound, but my empirical experience suggests that the reverse may be true.

My experience as a developer for a vendor of a binary-only commercial Unix clone demonstrates that the range of strange PC hardware out there is more than sufficient to exercise plenty of unique corner cases.

My other stint of experience comes from working for an embedded Linux vendor. We saw a LOT more trouble from people trying to piece together tiny distributions from prebuilt binaries (even all from the same source) than from people willing to build everything from source. A very common problem we saw with people who tried to do all of their system work with binaries only was subtle version dependencies among libraries as people upgraded individual packages over time. These problems simply do not occur if every library is built successively against the existing set of binaries on the system.
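The version-dependency problem can be sketched as a consistency check: record which library version each package was built against, then compare that with what is installed now. The data model and the `find_stale_dependents` helper are hypothetical, and a mismatch is only a warning sign, since a newer library may still be binary-compatible.

```python
def find_stale_dependents(installed, built_against):
    """List (package, library, built_version, installed_version) mismatches.

    installed:      {library: version currently on the system}
    built_against:  {package: {library: version present when the package was built}}
    """
    stale = []
    for package, deps in sorted(built_against.items()):
        for library, built_version in sorted(deps.items()):
            current = installed.get(library)  # None if the library is gone
            if current != built_version:
                stale.append((package, library, built_version, current))
    return stale


# Hypothetical system state: libfoo was upgraded after "app" was built.
installed = {"libfoo": "2.0", "libbar": "1.4"}
built_against = {
    "app": {"libfoo": "1.9", "libbar": "1.4"},
    "tool": {"libbar": "1.4"},
}

for package, library, built, current in find_stale_dependents(installed, built_against):
    print(f"{package} was built against {library} {built}, but {current} is installed")
```

Rebuilding each package against the libraries already on the system, as described above, keeps this list empty by construction.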

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds