OLS 2001 coverage
The Scyld Beowulf Distribution
Beowulf clusters have been the obvious heirs to the supercomputing throne for years; in no other way can so much raw power be had so cheaply. One of the factors limiting their success, however, is that there is no easy way to set one up. Those wishing to deploy clusters generally end up with a room full of parts and a little label saying "some assembly required." In general, deploying a cluster means having a full-time person around to put it together and keep it running. (Alternatively, off-the-shelf clusters do exist, but they tend to be expensive.)
Scyld's approach is to create software which makes it as easy as possible to assemble, manage, and use a Beowulf cluster. That software is then packaged in the form of (yet another) Linux distribution. If you build a cluster with Scyld's package, you get something that gives the appearance of a single machine. Administration is facilitated by a simple graphical interface, and users need never know that they are working on a clustered system.
Developers, however, still must keep clustering very much in mind when they write their code.
The Scyld distribution is based on Red Hat 6.2, for the simple reason that it is the most widely deployed Linux system out there. It still uses the 2.2 kernel ("we need a reliable kernel"), and is, to the greatest extent possible, a generic Linux distribution.
Scyld clusters are built around a master node. The master runs the show, handles all the administrative work, and is the only node that users can actually log into. It is, in fact, the only node with any user accounts at all.
The body of the cluster is made up of slave nodes. Their job is to crank on tasks handed to them by the master. They have minimal filesystems, and can even be diskless. Slaves can be added to (or removed from) the cluster on the fly.
Jobs running on the cluster use the "BProc" system. BProc adds a new kind of "remote fork," which allows a process to easily spawn children on the slave nodes. It works by initializing the process on the master, then dumping out its memory areas, copying them to the designated slave, and starting them up through a special binary file type. The process must be aware of this movement, which can be disruptive at times (e.g. open files are usually closed). Thus, BProc process migration is not totally transparent, as MOSIX attempts to be; Scyld has traded off some extra programming time for better performance.
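The fork-style programming model described above can be sketched in a few lines of Python. This is only a local simulation, not BProc itself: the rfork() helper is a hypothetical stand-in that calls the ordinary os.fork() and ignores its node argument, where a real remote fork would start the child on the designated slave. The run_on_nodes() name and the exit-code convention are likewise assumptions made for the illustration.

```python
import os


def rfork(node):
    """Hypothetical stand-in for a BProc-style remote fork.

    This sketch simply forks locally and ignores `node`; on a real
    cluster the child would come up on the slave identified by `node`.
    Like fork(), it returns 0 in the child and the child's pid in the
    parent.
    """
    return os.fork()


def run_on_nodes(nodes, work):
    """Spawn one child per node, run work(node) there, collect exit codes."""
    children = {}
    for node in nodes:
        pid = rfork(node)
        if pid == 0:
            # Child side: in the real system this code would now be
            # executing on the slave node.
            code = 1
            try:
                code = work(node)
            finally:
                # Leave immediately so the child never falls back into
                # the parent's loop.
                os._exit(code)
        children[pid] = node

    # Parent side (the master): wait for every child it started.
    results = {}
    for pid, node in children.items():
        _, status = os.waitpid(pid, 0)
        results[node] = os.WEXITSTATUS(status)
    return results
```

The parent keeps a table of its children and waits on them, which mirrors the way the master remains the single point of control for processes running elsewhere. Anything the child inherits from the parent (open files in particular) is exactly the kind of state that, per the description above, a program cannot assume will survive the move to a slave.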
For the most part, however, the remote process mechanism works nicely. The processes can be monitored on the master; they show up normally in tools like "ps" and "top." If the parent process is killed on the master (with a ^C interrupt on the keyboard, say), the children all go as well. From the point of view of a user, all of the processes are running on the master.
Another cool feature: the master controller can make use of the "wake on LAN" feature of some network controllers to bring up slave nodes as needed. In a cluster using this feature, computing nodes need to be running (and consuming power) only when there is work for them to do.
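How Scyld's master triggers the wakeup is not described here, but the standard wake-on-LAN "magic packet" is simple enough to sketch: six 0xFF bytes followed by the target's MAC address repeated sixteen times, usually sent as a UDP broadcast. The function names, the default broadcast address, and the choice of port 9 below are assumptions for illustration, not details of Scyld's software.

```python
import socket


def build_magic_packet(mac):
    """Build a wake-on-LAN magic packet for a MAC like '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError("MAC address must contain exactly 12 hex digits")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on bad hex
    # Six 0xFF bytes, then the MAC address repeated sixteen times.
    return b"\xff" * 6 + mac_bytes * 16


def wake(mac, broadcast="255.255.255.255", port=9):
    """Broadcast the magic packet over UDP (address and port are assumed defaults)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

A master that tracks each slave's MAC address could call something like wake() when work arrives for a node that is currently powered down; the network controller on the slave must support the feature and have it enabled.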
Scyld's software is available under a free license, of course. The company is planning to make its money through the usual combination of box set sales, consulting services, support services, and training.
Eklektix, Inc. all rights
Linux ® is a registered trademark of Linus Torvalds