Distributions

News and Editorials

Building a High-Performance Cluster with Gentoo

April 10, 2007

This article was contributed by Donnie Berkholz

People often laugh off the optimization you gain from compiling your own software with Gentoo Linux. But there is at least one area of Linux that needs to eke out every last bit of performance from hardware: high-performance computing (HPC) clusters. They are the domain of dedicated tweakers, always searching for another 1% increase in performance. If you can increase the speed of your code by 5%, you save roughly a day and a half of computing time every month. The amount of work you can accomplish with that extra time really adds up when you consider hundreds or thousands of CPUs. These clusters are the big brothers of that distcc or openMosix setup you have at home, with an entirely new collection of problems.
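
As a rough check on the "day and a half" figure, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes a 30-day month of continuous computation, and the function names are made up for this example. It shows two readings of "5% faster": a 5% cut in run time, which gives exactly 1.5 days, and a 5% gain in throughput, which gives slightly less.

```python
# Assumption: a 30-day month in which the cluster computes continuously.
DAYS_PER_MONTH = 30


def days_saved_by_runtime_cut(fraction, days=DAYS_PER_MONTH):
    """Days freed per month if the same work takes `fraction` less time."""
    return days * fraction


def days_saved_by_throughput_gain(fraction, days=DAYS_PER_MONTH):
    """Days freed per month if the code does `fraction` more work per unit time."""
    return days - days / (1 + fraction)


def cpu_days_saved(fraction, cpus, days=DAYS_PER_MONTH):
    """Total CPU-days freed per month across `cpus` processors (run-time-cut reading)."""
    return cpus * days_saved_by_runtime_cut(fraction, days)


if __name__ == "__main__":
    print(days_saved_by_runtime_cut(0.05))
    print(days_saved_by_throughput_gain(0.05))
    print(cpu_days_saved(0.05, cpus=1000))
```

Under either reading the saving is close to a day and a half per CPU per month, which is why it multiplies so quickly on a large cluster.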

By using Gentoo, you can optimize compilation to your heart's content without being forced to leave the distribution's packaging system. The Portage package manager supports arbitrary setting of compilation flags and linker flags as well as non-GCC compilers. Fortran may seem like a dead language to many readers, but its use in scientific computing remains vast. Many HPC cluster administrators install multiple Fortran compilers, each with its own strengths and weaknesses, so supporting these compilers within a distribution's packaging system makes the admin's job significantly easier.

Creating a Gentoo-based cluster is not for the fainthearted, however. Less experienced Linux administrators who don't need to optimize their clusters for speed or size may wish to go with a prepackaged cluster distribution such as OSCAR, Rocks, or Warewulf. But if you need to get the most from your hardware, if you want to minimize your on-disk profile by leaving out useless features and packages, or if you enjoy the easy maintenance Portage provides, then use Gentoo. I founded the Gentoo Cluster Project four years ago to make Gentoo better for clustering by creating a community of cluster administrators and writing documentation to help those new to Gentoo or new to clusters. The major trade-off of using Gentoo rather than a prepackaged cluster distribution, in my mind, is increased initial set-up time in exchange for ongoing ease of administration. This is the same trade-off you will find in going with diskless rather than diskful clusters.

Gentoo's flexibility as a metadistribution means you can make whatever you want from it without hacking and slashing all over the place, as you may need to if starting from another distribution. Your changes to the base configuration are easy to find, document, and reproduce. You can even start out with something more minimal than a Gentoo base system by taking advantage of Portage's ROOT support to install only what you need to an arbitrary location (described in more detail in this LWN article). I find this most useful for diskless clusters. You can easily install to a location on an NFS server such as /opt/cluster/, which the diskless nodes use as their filesystem root. By using UnionFS to mount a read-only NFS root with tmpfs layered on top, all of the nodes can use the same filesystem without any concerns about multiple simultaneous writes. You can push only security fixes using `glsa-check`, and with a single invocation of `emerge`, you can manage full system updates to the server root or the diskless root.

Diskful clusters can also benefit from Gentoo. By now, you've probably wondered why anyone would use Gentoo on a diskful cluster, since it would seem to mean compiling every package on each of those hundreds of machines. But that isn't the case. Portage supports use of a binary package server, so you can compile packages just once per architecture rather than once per machine. For a serious cluster, however, you may wish to create more finely grained packages based on the roles of machines within the cluster. File servers require a different set of features (USE flags, in Gentoo) than compute nodes, and they may even benefit from a different set of compilation flags, for example to produce smaller binaries and thus lower disk I/O.

Now you've learned a little about the basic idea behind an HPC cluster and how it works on Gentoo, but what about the applications and communications? A big stack of middleware makes it all possible. At the lowest level, all HPC programs have to talk to each other somehow. The dominant standard today is the Message Passing Interface (MPI). HPC programs must be specifically written to use MPI; it is not transparent to the application. MPI implementations are API-compatible, but regrettably, they are not ABI-compatible. Programs must be specially compiled for each MPI implementation they use. As with Fortran compilers, each MPI implementation has its strengths and weaknesses. One popular, "new" implementation is Open MPI. It's a merger of three existing implementations: FT-MPI, LA-MPI, and LAM/MPI. The other popular open-source implementation is MPICH2. Both projects are under active development, so testing them with your workloads is a requirement if you must choose one.

On the level above these custom-written applications sits a batching system such as Torque. This is where users send their computing jobs, and it takes care of the details of when and how to run these jobs. Submitted jobs sit in a queue until their turn, and the batching system can use a number of scheduling algorithms to decide when to run jobs. Sometimes, these simpler batching systems fall short of your needs. That's when you call in the big guns: something like Maui. It's an extremely flexible job scheduler that supports a vast array of scheduling policies, priorities, job reservations, and resource sharing.
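
To make the queueing idea concrete, here is a toy sketch in Python of a batch queue that starts jobs in priority order, falling back to submission order among equal priorities. It is not how Torque or Maui is implemented; the class name, method names, and the strict "head of the queue must fit" rule are all assumptions chosen for illustration. Real schedulers add policies such as reservations and resource sharing on top of this basic picture.

```python
import heapq
from itertools import count


class ToyBatchQueue:
    """Hypothetical batch queue: highest priority first, then first come, first served."""

    def __init__(self, total_nodes):
        self.free_nodes = total_nodes
        self._heap = []
        self._order = count()  # submission counter, used as a FIFO tie-breaker

    def submit(self, name, nodes, priority=0):
        # heapq is a min-heap, so negate the priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._order), name, nodes))

    def start_ready_jobs(self):
        """Start jobs from the head of the queue while the head job fits."""
        started = []
        while self._heap and self._heap[0][3] <= self.free_nodes:
            _, _, name, nodes = heapq.heappop(self._heap)
            self.free_nodes -= nodes
            started.append(name)
        return started

    def finish(self, nodes):
        """Return the nodes of a completed job to the free pool."""
        self.free_nodes += nodes


if __name__ == "__main__":
    queue = ToyBatchQueue(total_nodes=4)
    queue.submit("small", nodes=2)
    queue.submit("wide", nodes=4)
    queue.submit("urgent", nodes=1, priority=5)
    print(queue.start_ready_jobs())
```

In this sketch a wide job at the head of the queue blocks everything behind it until enough nodes are free. That simple behavior is one reason a site might want a more flexible scheduler.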

At some point, a basic cluster like this will fall short of your needs. You may need to investigate specialized clustering filesystems such as Lustre or PVFS2, migrate your network from basic Ethernet to something with better performance, such as Myrinet or InfiniBand, or find another solution to your problem. In clustering, the answer is almost always to benchmark and profile, because the problem is specific to your application rather than being generic to all clusters. Using Gentoo gives you the flexibility and power to make many of these changes while still staying within the Portage package management system.

New Releases

Debian GNU/Linux 4.0 released

The Debian Etch release has happened. "Using a now fully integrated installation process, Debian GNU/Linux 4.0 comes with out-of-the-box support for encrypted partitions. This release introduces a newly developed graphical frontend to the installation system supporting scripts using composed characters and complex languages; the installation system for Debian GNU/Linux has now been translated to 58 languages."

Debian GNU/Linux 3.1 updated

For those Debian admins who are not yet ready to upgrade to Etch, the Debian Project has released an update to the old stable 3.1 sarge release. "Users who would like to continue using Debian GNU/Linux 3.1 are advised to update their /etc/apt/sources.list network sources to refer to 'sarge' instead of 'stable'."

Aurora SPARC Linux Build 2.98 (Beta 1 for 3.0)

The Aurora SPARC Linux project has announced Build 2.98 to the world. This is a beta release for what will become 3.0. Features in this release include a Fedora Core 6 based tree of packages (some things are newer), support for Niagara hardware (Sun T1000, T2000), gcc-4.1.1, GNOME 2.16, KDE 3.5.5, and kernel 2.6.20 (with patches!).

Linbox Directory Server 1.1.4 available

The Linbox Directory Server 1.1.4 is now available. Linbox Directory Server is an enterprise directory platform based on LDAP, designed to manage identities, access control information, policies, application settings and user profiles. This version features a Spanish translation, thanks to Alejandro Escobar, and mailbox quota support.

Puppy Linux 2.15 CE released (Go2Linux.org)

Go2Linux.org has a release announcement for Puppy Linux 2.15 Community Edition. "The Puppy 2.15CE (Community Edition) is the result of collaboration of a team of Puppy enthusiasts. It is built upon version 2.14 but with many enhancements. In particular the guys have worked on an improved user-interface and nice out-of-the box first impression."

Distribution News

Debian project leader election 2007 results

The results are in for the 2007 Debian project leader election: the winner is Sam Hocevar. See the election page for lots of details.

teTeX is gone (from Debian unstable), beware of build-dependencies

The Debian TeX Task force is preparing an upload of TeX Live 2007 to unstable. With this version, teTeX will vanish as a separate package and only continue to exist as transitional packages. "teTeX has been abandoned upstream. TeX Live, which uses most of the scripts developed for teTeX, is its successor in Debian (and elsewhere), and we do not plan to support both systems beyond the lifetime of etch."

Fedora Wiki Accounts

Wiki woes have led to the deletion of many Fedora wiki accounts. "Those wishing to keep an account should simply sign up again."

Mandriva Flash 4GB Released

Mandriva Flash 4GB provides a full-featured system - Mandriva Linux 2007 KDE 32-bit - on a bootable USB 2.0 key. All you have to do is plug in the USB key, turn the PC on and the Mandriva Linux operating system is ready to use in no time, with all you need for office work, Internet and multimedia tasks. System configuration, preferences and data are all saved to the 4GB key.

Novell's SLED available on Sun x64 workstations

Novell, Inc. has announced the release of SUSE Linux Enterprise Desktop 10 for the Sun Ultra workstation platform. "The Sun Ultra 20, Ultra 20 M2, Ultra 40 and Ultra 40 M2 Workstations are available with SUSE Linux Enterprise Desktop, certified and supported by Sun. The workstations have been fully tested and YES Certified(TM) to run SUSE Linux Enterprise Desktop, a complete desktop computing solution that dramatically reduces costs, improves end-user security and increases workforce productivity."

Licensing of the Ubuntu Documentation Wiki

New material added to the Ubuntu documentation wiki will be licensed under the Creative Commons license. "This decision is not intended in any way to underestimate the value of contributions, but rather to ensure that the material on the documentation wiki complies with the same standards of openness as the Ubuntu project as a whole."

New Distributions

Linux for Clinics Alpha Release (LinuxMedNews)

LinuxMedNews takes a look at the Linux For Clinics distribution, which has just released an alpha version. "The Linux For Clinics (LFC) Project consists of a team of people who have a common interest in health, medicine, humanity and free and open source software (FOSS). Our team represents a community that shares the common ideals of aiding mankind and treating everyone with respect so that they will treat others in kind. This philosophy is represented by the African word 'UBUNTU' which means 'Humanity Towards Others'."

NixOS

Lambda the Ultimate introduces NixOS, a Linux distribution based on Nix, a purely functional package management system. NixOS is an experiment based on Eelco Dolstra's PhD thesis, The Purely Functional Software Deployment Model. From the Nix home page: "Nix is a purely functional package manager. It allows multiple versions of a package to be installed side-by-side, ensures that dependency specifications are complete, supports atomic upgrades and rollbacks, allows non-root users to install software, and has many other features. It is the basis of the NixOS Linux distribution, but it can be used equally well under other Unix systems."

Distribution Newsletters

Fedora Weekly News Issue 82

The Fedora Weekly News for April 7, 2007 covers Aurora SPARC Linux Build 2.98 (Beta 1 for 3.0), Seeking reviewers for Summer of Code applications, Fedora Account System Changes, and several other topics.

Gentoo Weekly Newsletter

The Gentoo Weekly Newsletter for March 26, 2007 looks at the Developer of the Week (dsd), Gentoo Village at CCC, and several other topics.

Ubuntu Weekly News: Issue #33

The Ubuntu Weekly Newsletter for March 24, 2007 covers Feisty Fawn's beta release, newly approved Ubuntu members, the big effort by the "Ubuntu Desktop Effects" team, the buzz about Ubuntu in the press and the blogosphere, and much more.

Ubuntu Weekly News: Issue #35

The Ubuntu Weekly Newsletter for April 8, 2007 is out. This edition looks at Feisty Herd 6 canceled, Feisty Frozen for Release Candidate Preparation, Licensing of the Documentation Wiki Discussed, Launchpad Open for Beta Testing, and several other topics.

DistroWatch Weekly, Issue 197

The DistroWatch Weekly for April 9, 2007 is out. "Debian "Etch", the long-awaited release from the largest Linux distribution project that has ever graced the Internet era, finally hit the download mirrors on Easter Sunday and provided some welcome news relief during the otherwise unexciting weekend. But the current string of important releases will not stop here; Mandriva is about to announce a new stable release of its flagship product, Ubuntu is busy preparing its first and only release candidate for "Feisty Fawn", and openSUSE is hard at work in finalising a new alpha release for delivery later this week. In other news, SimplyMEPIS announces its latest and greatest, Samuel Hocevar becomes the new Debian Project Leader, and Arch Linux changes its release policy. Finally, don't miss the third part of our overview of Top Ten Distributions."

Newsletters and articles of interest

The Perfect Setup - Debian Etch (Debian 4.0) (HowtoForge)

HowtoForge has a tutorial demonstrating a server setup on Debian 4.0. "This tutorial shows how to set up a Debian Etch (Debian 4.0) based server that offers all services needed by ISPs and hosters: Apache web server (SSL-capable), Postfix mail server with SMTP-AUTH and TLS, BIND DNS server, Proftpd FTP server, MySQL server, Courier POP3/IMAP, Quota, Firewall, etc. This tutorial is written for the 32-bit version of Debian Etch, but should apply to the 64-bit version with very little modifications as well."

Ubuntu-based Linux Mint tests KDE version (DesktopLinux)

DesktopLinux looks at the Linux Mint KDE edition. "The Ireland-based Linux Mint team yesterday made available the first release candidate of its next version, Linux Mint 2.2 KDE Edition Beta 020. Code-named "Bianca," it uses the KDE 3.5.6 desktop for the first time, running on a 2.6.17-10 kernel, the team said."

Distribution reviews

Dyne:Bolic 2.4.2: A live CD multimedia studio (Linux.com)

Linux.com reviews Dyne:Bolic 2.4.2. "The Dyne:Bolic distribution is a live CD designed for creating, broadcasting, and publishing all kinds of audio, video, and graphic content. It includes some of the best free and open source tools with which you can compose music, mix video streams, and create 3-D animations. Since version 1.4.1, which we reviewed last year, Dyne:Bolic has changed little on the outside. The developers have shuffled the application menu, swapped out some applications, and upgraded all apps to their respective stable versions. The major change is that the 2.x releases are based on a new dyne:II core which has been written from scratch. The new core makes it easier to create new customized versions of Dyne:Bolic."

GoblinX Premium 2007.1 (tuxmachines.org)

TuxMachines.org reviews GoblinX 2007.1 Premium. "GoblinX developers released their 2007.1 Premium version of GoblinX Linux recently and I was able to obtain the 1-cd version for testing. GoblinX has always been a very interesting project to watch with their odd-looking almost macabre-themed XFCE distro. It's based on Slackware, so you know they have a good foundation and XFCE is coming into its own. With new versions of GoblinX being released about once per year, it's hard to pass up the chance to test it when a new one arrives on the scene."

I'm JADed ! (Linux Journal)

Dave Phillips reviews JAD, the JackLab Audio Distribution. "The latest JAD is based on the openSUSE 10.2 distribution, which is, according to Wikipedia, "a community project, sponsored by Novell, to develop and maintain a general purpose Linux distribution". SUSE is one of the most popular Linux distributions, with a large community of users and developers primarily based in Europe. However, potential users should have no fear if they don't happen to live in a European country: openSUSE is clearly designed for use anywhere, with full internationalization support."

Pioneer Linux fails to excite (Linux.com)

Linux.com looks at Kubuntu-based Pioneer Linux. "In November, Techalign released its Pioneer Linux distribution, based on Kubuntu, and available in several paid versions and one free version. I tested the recent Pioneer Linux Basic Release 2 (R2), which is based on Kubuntu Edgy 6.10. Apart from a few minor cosmetic changes and some additional applications, Pioneer isn't very different from a stock Kubuntu."

Red Hat Enterprise Linux 5: Some Assembly Required (eWeek)

eWeek reviews RHEL 5 with an emphasis on virtualization features. "The benefit of using virtualization within general-purpose operating systems is that these products typically offer broader hardware support than do bare-metal or appliance-type virtualization products. The downside is that operating systems, such as RHEL5, tend to offer virtualization services like erector-set pieces - virtualization-savvy OSes can deliver results similar to a product like ESX server, but there's some assembly required."

A first look at SimplyMEPIS 6.5 (DesktopLinux)

DesktopLinux.com has a review of SimplyMEPIS 6.5 rc2. "SimplyMEPIS 6.5 is built on the 2.6.17 Linux kernel, based on Ubuntu 6.06 LTS (Long Term Service), aka "Dapper Drake," by the way. Until version 6.0, MEPIS had been built on Debian, but MEPIS designer Warren Woodford found that Debian Stable was too far behind the curve, and Debian Testing/Unstable was advancing too quickly and breaking too often, so he switched to Ubuntu. Unlike Ubuntu, which uses GNOME for its default desktop, MEPIS uses KDE 3.5.3." The final release of SimplyMEPIS 6.5 is now out.

Page editor: Rebecca Sobol

Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds