Date: Fri, 10 Apr 1998 04:21:49 -0400
From: Jeffrey Moyer <phro@segfault.res.WPI.NET>
To: beowulf@cesdis.gsfc.nasa.gov
Subject: transparent process migration

Hello again,

We have managed to convince our school that implementing process migration on a Beowulf cluster would be a good thing. :) We have a slow testbed of machines for proof of concept, and will be starting as soon as possible. The following ideas came up in discussion:

o Migrate threads, rather than whole processes, where applicable.
o Maddog suggested we create utilities that would allow one to force task migration from one machine, so that one could upgrade hardware w/o taking down the whole cluster.
o If a node fails, reinitiate its chunks of the computation on other workstations. Perhaps we can do fault tolerance after all. :) Of course this does incur a little more overhead, but that's to be expected.

The reason I am posting this is that most of you have more experience with clusters than my partner, Frank Sweetser, and I. Thus, any input you could give and any suggestions you might have would be greatly appreciated.

Don, I still have all of the email we exchanged about task migration, and that will form the basis of our work to begin with.

We look forward to starting this project, and as I mentioned, all input is most welcome.

-Jeff