Date: Wed, 10 Jun 1998 16:44:13 -0400 (EDT)
From: "Robert G. Brown" <rgb@phy.duke.edu>
To: "beowulf@cesdis1.gsfc.nasa.gov" <beowulf@cesdis.gsfc.nasa.gov>,
Subject: Re: PC Week article on Beowulf
On Tue, 9 Jun 1998, Scott Fraser wrote:
> Well I thought someone would have posted this by now, but since not,
> here goes:
>
> http://www.zdnet.com/pcweek/reviews/0608/08linux.html
As one of the "knowledgeable passers by" (passerbys won't parse,
right:-) I'm cc'ing this to one of the authors as well as the list.
Dear Mr. Chowdhry,
I read with pleasure (and a bit of amusement) that PC Week reviewed
Extreme Linux (and by extension, the much older Beowulf/parallel
virtual machine efforts) in a recent edition. Although overall your
article was interesting and informative, I did have a few observations
to make. Regarding:
    ...Extreme Linux's low $29 price belies the technology's lofty
    requirements, and there is no organized method of technical
    support for Extreme Linux, leaving users to post their problems
    to a newsgroup and hope that some knowledgeable passerby takes
    an interest.
At the recent Linux Expo, a pair of college kids proudly announced
that they were preparing to build "the world's slowest Beowulf" out of
a pile of surplused 80486 systems and an ordinary ethernet hub.
Although not strictly true (my first efforts had them beat, many years
ago, as did many others:-) the point is clear -- one can build a
beowulf out of virtually anything one can glue together with some sort
of network, including (in principle) 80386's, since linux WILL run on
an 80386. There is obviously no upper bound -- you get a BETTER
system with high end technology, but virtually anybody with two or
more machines can play just for fun.
The $29 price reflects the reality that everything in the package is
available for free on the Web. The CD is just a particularly
convenient packaging of it whose purchase will support the eventual
commercialization of distributed parallel code.
Finally, note that many of the authors/maintainers of linux and its
associated tools and drivers "passerby" on this (beowulf) list and on
its companion lists (e.g. linux-smp), and that few problems are
considered so trivial as to be unworthy of response and help. A
reasonable way to consider the kind of help available via this list is
to imagine the kind of technical support one might get for a Microsoft
product if Microsoft's kernel engineers were all on a technical
mailing list available to a literal world of consumers. (I sometimes
wonder if MS engineers lurk on our lists out of sheer loneliness for
meaningful conversations with their technical peers :-). Don't
underestimate this resource. More Ph.D.-level expertise (degree and/or
experience based) in computer science, physics, statistics, quantum
chemistry, whatever, listens in on and participates on these lists than
you can imagine.
Between the beowulf list and the linux-smp list, one can get help on
anything from advice setting up your first beowulf (whether or not it
is based on Red Hat Linux or the very RECENT Extreme Linux CD) to a
technical explanation of some feature deep in the kernel. This kind
of help is as instantly available to some Joe setting up his first
Linux computer as to somebody struggling with a
truly broken device driver. Linus Torvalds (and friends -- the kernel
is really supported by a distributed team all over the world where
anybody can play -- if they are good enough) actually listens in on
the linux-smp list and certainly responds if a problem relates to the
kernel itself or its ongoing design process.
I really think that you ought to have JOINED this list and the
linux-smp list and tried them for a while before concluding that
support for either linux itself in general (any distribution), the
particular RH linux distribution you used, or the overall Beowulf
software effort is either disorganized or not effective. Is your point
of reference Microsoft's legendary "organized" technical support...;-)?
    Beowulf is a series of kernel modifications and a message-passing
    system that allows linear scaling of computers in a coarse-grained
    architecture (processors spread over several computers as opposed
    to a symmetric multiprocessing system), connected via a standard
    networking medium.
Hmmm, this could have been a bit more technical -- and accurate. For
example, what kernel modifications? Many folks who use linux much
eventually configure and cut kernels with specific device support and
the like, although Red Hat linux manages to avoid the absolute need
for that for most users by the judicious use of modules and GUI-based
configuration software. It certainly isn't a necessity. The simplest
Beowulfs are built out of stock linux (any distribution, not just Red
Hat, or make your own) plus at least one of [PVM,MPI] plus as much
"glue" software as you care to assemble or need for your environment.
A lot of this is collected on the Extreme Linux CD preinstalled for
Red Hat, but ALL of it is available freely in source on the Net.
I also am amused by the bit about "linear scaling of computers...":-) A
lot of discussion on this (the beowulf) list has to do with the
NONlinear scaling that occurs for lots of parallelizable problems
(fortunately not my own;-) -- and how to overcome it. If you have any
doubt concerning the technical level accessible on this list, look
over its archives! I learn something useful from the lists nearly
every week!
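
To make the "nonlinear scaling" point concrete, here is a minimal sketch that is not part of the original message. It uses Amdahl's law, the standard textbook model, in which only a fraction of a job can be split across nodes. The second function adds a purely hypothetical linear communication cost per extra node (the name `comm_cost_per_node` and the values used with it are assumptions for illustration, not measurements from any real Beowulf).

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Ideal speedup on `nodes` processors when only `parallel_fraction`
    of the run time can be divided among them (Amdahl's law)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


def speedup_with_overhead(parallel_fraction: float, nodes: int,
                          comm_cost_per_node: float) -> float:
    """Same model plus a hypothetical communication cost that grows
    linearly with the number of additional nodes (an assumption)."""
    serial_fraction = 1.0 - parallel_fraction
    overhead = comm_cost_per_node * (nodes - 1)
    return 1.0 / (serial_fraction + parallel_fraction / nodes + overhead)


if __name__ == "__main__":
    # Print both models side by side for a job that is 90% parallelizable.
    for n in (1, 2, 4, 8, 16, 64):
        ideal = amdahl_speedup(0.9, n)
        real = speedup_with_overhead(0.9, n, 0.01)
        print(f"{n:3d} nodes: ideal {ideal:5.2f}x, with overhead {real:5.2f}x")
```

Under this toy model the speedup never exceeds 1/(serial fraction), and once communication cost is included, adding nodes past some point makes the job slower rather than faster.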
Finally, the report card's report card (drum roll):
    PROS: Allows the clustering of computers for significantly
          improved performance.
    CONS: Software needs to be written specifically for Beowulf; no
          organized technical support available.

    USABILITY .......... F
    CAPABILITY ......... B
    PERFORMANCE ........ A
    INTEROPERABILITY ... D
    MANAGEABILITY ...... C
Actually, not too unreasonable. I'd give your report card a B-. On
the Pro side, you should have added the phrase (understood, I think,
from the context of the article) "for certain problems". Not all
tasks parallelize. This is not a pointless observation -- one of the
common newbie questions on the linux-smp list is "I have set up my dual
PPro with SMP linux, but everything I run seems to take just as long.
How Come?" Most non-techies, and actually a whole lot of techies,
fail to understand how to differentiate parallelizable and
non-parallelizable tasks AND (for things in the former category)
parallelized code and non-parallelized code. Basically, there are
lots of computer tasks in the parallelizable category, but damn few
(outside of the kernel and device drivers themselves) in the
parallelizable AND parallelized category. Of course, to GET
commercial parallelizable tasks (e.g. distributed searches, distributed
sorts, distributed accounting sums) parallelized, one requires a
stable, well-understood, well-supported parallel environment -- one of
several motivations for the Extreme Linux project.
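
As an illustration of the difference between a parallelizable task and parallelized code (a sketch added for this write-up, not code from the Extreme Linux CD), consider the "distributed accounting sum" mentioned above. A plain `sum(values)` is parallelizable but not parallelized. The version below is parallelized by hand: split the data, sum each piece separately, then combine the partial sums. Threads from Python's standard library stand in for cluster nodes here and will not give a real speedup; on an actual Beowulf the same pattern would be expressed with PVM or MPI messages. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_workers):
    """Split `values` into `n_workers` contiguous slices of nearly equal size."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    size, extra = divmod(len(values), n_workers)
    chunks = []
    start = 0
    for i in range(n_workers):
        # The first `extra` chunks each take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_sum(values, n_workers=4):
    """Sum `values` by giving each worker one chunk, then combining
    the partial sums (the 'reduce' step)."""
    chunks = split_into_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_sums = list(pool.map(sum, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    ledger = list(range(1, 101))
    print(parallel_sum(ledger, n_workers=4))
```

The split/compute/combine structure has to be written by the programmer; nothing in the operating system discovers it automatically, which is why the dual-PPro newbie sees no speedup from unmodified serial programs.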
I also think that the "F" you give EL for usability (if you will
permit me a gentle observation) reflects more your ignorance of the
realities of parallel coding than any real problem with Extreme Linux
or the Beowulf concept. Could I ask, what kind of product would you
give an A to? Something that parallelizes code written by a totally
ignorant programmer/user trying to recompile a word processor or some
other arbitrary Von Neumann code written in raw, naked C with all
sorts of pointers, structures, and presumed serial execution? It just
won't work. Parallel programming is DIFFERENT -- if Von Neumann code
is one dimensional, parallel code is multidimensional, with both
spatial dimension (topology and nature of the IPC channels) and
temporal dimension (synchronization of multiple threads of code, load
balancing, and so on). So sure, it requires considerable discipline
and expertise, and through no fault of your own you failed to master
it in time to write code yourself to support an article for a
deadline:-). Nevertheless, it is, really, considerably simpler to
learn to write parallel code (for tasks that can be parallelized) than
it is to learn to write code at all -- PVM and MPI are both fairly
usable >>as parallel support packages go<< and it isn't fair to judge
them in any other context.
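
The "spatial" and "temporal" dimensions described above can be sketched with a small master/worker program. This is an illustration under stated assumptions, not PVM or MPI code: standard-library queues play the role of the IPC channels, threads play the role of nodes, and the names `worker` and `run_master` are made up for the example. Workers pull the next task as soon as they are free (a crude form of load balancing), and a `None` sentinel plus `join()` handles the synchronization at shutdown.

```python
import queue
import threading


def worker(tasks, results):
    """Receive (task_id, n) messages, send back (task_id, n squared)."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel from the master: no more work
            break
        task_id, n = item
        results.put((task_id, n * n))


def run_master(numbers, n_workers=3):
    """Farm out one task per number and gather the results in order."""
    tasks = queue.Queue()    # channel: master -> workers
    results = queue.Queue()  # channel: workers -> master
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for task_id, n in enumerate(numbers):
        tasks.put((task_id, n))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    # Results arrive in whatever order workers finish; the task_id
    # carried in each message lets the master restore the order.
    ordered = [None] * len(numbers)
    for _ in numbers:
        task_id, value = results.get()
        ordered[task_id] = value
    for t in threads:
        t.join()
    return ordered


if __name__ == "__main__":
    print(run_master([1, 2, 3, 4, 5]))
```

Even in this toy, the programmer has to decide who talks to whom, how work is handed out, and how everyone learns that the job is finished. None of those questions exist in serial code.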
I also beg to disagree with the interoperability grade of D. First,
if you give EL this grade because the software on the CD is prebuilt
for linux, then I certainly hope (for fairness' sake) you grade ALL
Microsoft/Windows software products with a D as well, cuz they sure
don't run under Linux... software that runs transparently under more
than one operating environment is by far the exception rather than the
rule.
HOWEVER, the beowulf software that is PACKAGED for linux on the
Extreme Linux CD is precisely that kind of software. It runs on
nearly ALL Unix platforms, and variants run under Windows NT. If
Windows 95 were truly a multiuser, multitasking virtual memory
operating system (which frankly it is not) it would run under it as
well, and even WITHOUT this fairly obvious requirement I think there
have been Windows 95 ports of at least PVM. One could easily enough
assemble an "Extreme Solaris" CD, an "Extreme Irix" CD, and so on,
although most folks find it just as easy to grab what they need in
source from the net and build it.
In addition, did you realize that Linux in general, Red Hat Linux in
particular, and Extreme Linux by extension, run on Intel systems, DEC
Alpha systems, Sparc systems (and more -- are there any Mac/Power
PC-based Beowulfs out there? -- I think so:-)? Linux/GNU is one of the
most portable and (hardware) interoperable environments available
today! Marrying the intrinsic >>hardware<< design interoperability of
the parallel support libraries to an operating system/environment
capable of running -- identically -- on more hardware architectures
than any other product is what makes the Beowulf/Extreme Linux concept
so appealing.
Nevertheless, not too shabby a treatment, so definitely a B of some
sort. Consider joining (or sticking around on) the aforementioned
mailing lists; take up linux as a hobby for a while (computer types
often really like it). With any luck, your next Linux articles will
be even more informative and authoritative!
Thanks,
rgb
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu