Date: Fri, 10 Apr 1998 04:21:49 -0400
From: Jeffrey Moyer <phro@segfault.res.WPI.NET>
To: beowulf@cesdis.gsfc.nasa.gov
Subject: transparent process migration

Hello again,

  We have managed to convince our school that implementing process migration
  on a Beowulf cluster would be a good thing.  :)  We have a slow testbed
  of machines for proof of concept, and will be starting as soon as
  possible.  The following ideas came up in discussion:

	o  Migrate threads, rather than whole processes, where applicable
	o  Maddog suggested we create utilities that would allow one to
	   force tasks to migrate off a given machine, so that one could
	   upgrade its hardware without taking down the whole cluster
	o  If a node fails, restart its chunks of the computation on
	   other workstations.  Perhaps we can do fault tolerance after
	   all. :)  Of course, this does incur a little more overhead, but
	   that's to be expected.
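
  To make the last idea a bit more concrete, here is a rough sketch of
  reassigning a failed node's chunks to the least-loaded survivors.  It
  is only an illustration in Python, not a design: the function name
  reassign_chunks, the chunk-to-node map, and the node names are all
  hypothetical, and it assumes the chunks can simply be restarted from
  scratch elsewhere.

```python
def reassign_chunks(assignment, nodes, failed_node):
    """Return a new chunk -> node map with failed_node's chunks moved.

    assignment:  dict mapping chunk id -> node name (hypothetical layout)
    nodes:       all node names in the cluster
    failed_node: the node that went down
    """
    survivors = sorted(n for n in nodes if n != failed_node)
    if not survivors:
        raise RuntimeError("no surviving nodes to take over the work")

    # Count how many chunks each surviving node already holds.
    load = {n: 0 for n in survivors}
    for node in assignment.values():
        if node in load:
            load[node] += 1

    new_assignment = dict(assignment)
    orphaned = sorted(c for c, n in assignment.items() if n == failed_node)
    for chunk in orphaned:
        # Pick the least-loaded survivor; break ties by node name.
        target = min(survivors, key=lambda n: (load[n], n))
        new_assignment[chunk] = target
        load[target] += 1
    return new_assignment


if __name__ == "__main__":
    before = {0: "n1", 1: "n1", 2: "n2", 3: "n3"}
    print(reassign_chunks(before, ["n1", "n2", "n3"], "n1"))
```

  The extra overhead mentioned above shows up here as the bookkeeping
  needed to know which chunk lives where, plus the work lost on the
  failed node.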

  I am posting this because most of you have more experience with clusters
  than my partner, Frank Sweetser, and I do.  Any input or suggestions you
  could give would be greatly appreciated.  Don, I still have all of the
  email we exchanged about task migration, and that will form the initial
  basis of our work.  We look forward to starting this project and, as I
  mentioned, all input is most welcome.

	-Jeff