
Ten Tips for Building Your First High-Performance Cluster (O'ReillyNet)

O'ReillyNet offers some advice to anybody considering building a Linux cluster. "Using the same hardware for each machine in the cluster will simplify installing and configuring your clusters, since you'll be able to use identical system images on each machine. It will simplify maintaining your cluster, since all of the systems have the same basic configuration. You'll need to stock fewer spare parts and will be able to swap systems in and out of your cluster as needed. But the really big savings will come when you program your cluster; you won't have to code for differences in performance among machines."


Posted Jan 4, 2005 6:02 UTC (Tue) by jd (guest, #26381)

I sort-of agree with the ten points, except that there are exceptions to all of them. For example, using homogeneous hardware makes a lot of sense in 99% of all cases. In the remaining 1% of cases, you may well find that having some specialized nodes makes sense.

However, the list doesn't go nearly far enough. There are many schemes for creating clusters. BProc, MOSIX and openMosix all have different strengths and weaknesses. Do you want PVM, MPI 1 or MPI 2? Are you going to use a parallel/clustering language, such as Occam? Are data distribution systems, such as BOINC or COSM, of any value?

There are also places where the list is correct, but doesn't list all of the options that you might well see. For example, the list does mention Ethernet (100 megabit and gigabit, but not 10 gigabit, although Linux does support that now). The list doesn't mention Myrinet (a popular option with cluster builders) or SCI (though that is somewhat less used). Not everything traverses Ethernet well, so a careful cluster builder would need to look at what type of networking best fits what they intend to do.

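Finally, no sane article on clustering can be complete without a mention of Amdahl's Law, which caps the speedup from adding nodes at the reciprocal of the fraction of the work that must run serially. Add the communication overhead that grows with the number of interconnects and, past some point, extra nodes make the job slower. There is simply no point in building a cluster so large that you actually lose performance.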
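
As a rough sketch of that point, the Python below computes the classic Amdahl speedup, 1 / ((1 - p) + p / n), and then adds a per-node communication cost. The linear overhead model, the function names and the numbers (95% parallel work, 0.1% of the single-node runtime lost per extra node) are assumptions chosen for illustration, not measurements from any real cluster.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal Amdahl speedup: the serial part of the job never shrinks."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def speedup_with_overhead(parallel_fraction, nodes, overhead_per_node):
    """Amdahl speedup plus a hypothetical communication cost.

    Assumption: each node beyond the first adds a fixed cost, expressed
    as a fraction of the single-node runtime. Real interconnects are
    rarely this simple.
    """
    serial_fraction = 1.0 - parallel_fraction
    communication = overhead_per_node * (nodes - 1)
    return 1.0 / (serial_fraction + parallel_fraction / nodes + communication)


if __name__ == "__main__":
    p = 0.95          # assumed fraction of the job that parallelizes
    overhead = 0.001  # assumed cost per extra node
    for n in (1, 8, 32, 256, 1024):
        ideal = amdahl_speedup(p, n)
        real = speedup_with_overhead(p, n, overhead)
        print(f"{n:5d} nodes: ideal {ideal:6.2f}x, with overhead {real:6.2f}x")
```

Under these assumptions the ideal curve can never exceed 20x (the reciprocal of the 5% serial part), while the curve with overhead peaks at around 30 nodes and drops below 1x by 1024 nodes, which is the "so large that you actually lose performance" case.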

Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds