Crash-only software: More than meets the eye
Posted Jul 13, 2006 18:54 UTC (Thu) by Segora (subscriber, #8209)
Parent article: Crash-only software: More than meets the eye
Hi,
This made me think of Joe Armstrong's (of Erlang fame) work on fault-tolerant systems [1]. The canonical way to build an Erlang/OTP system is to divide it into one or more applications, each of which has a supervision tree of processes. When a process crashes, the supervisor's restart strategy determines whether only the crashed process is restarted or all processes on the same level are restarted. If restarting does not help, the supervisor itself crashes and the fault is propagated upwards, which in the extreme case leads to the whole node being restarted via a hardware watchdog.
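To make the restart strategies a bit more concrete, here is a rough sketch in Python rather than Erlang. It is only a toy model and not the OTP API: the names `Supervisor`, `Escalate`, `start_child` and `child_crashed` are made up for illustration, and the restart limit is simplified to a plain counter (as far as I know, real OTP supervisors count restarts within a time window).

```python
from dataclasses import dataclass, field
from typing import Dict, List


class Escalate(Exception):
    """Raised when a supervisor gives up; its parent would see this as a crash."""


@dataclass
class Supervisor:
    # All names here are hypothetical; this only mimics the idea of a supervisor.
    name: str
    strategy: str = "one_for_one"  # or "one_for_all"
    max_restarts: int = 3          # simplified: a total count, not a rate
    children: List[str] = field(default_factory=list)
    starts: Dict[str, int] = field(default_factory=dict)
    restart_count: int = 0

    def start_child(self, child: str) -> None:
        """Register a child and record its first start."""
        self.children.append(child)
        self.starts[child] = 1

    def child_crashed(self, child: str) -> List[str]:
        """Handle a crash and return the names of the children restarted."""
        self.restart_count += 1
        if self.restart_count > self.max_restarts:
            # Too many crashes: stop restarting and propagate the fault upwards.
            raise Escalate(f"{self.name} exceeded {self.max_restarts} restarts")
        if self.strategy == "one_for_one":
            targets = [child]              # only the crashed process
        else:
            targets = list(self.children)  # every process on the same level
        for name in targets:
            self.starts[name] += 1
        return targets


if __name__ == "__main__":
    sup = Supervisor("db_sup", strategy="one_for_all", max_restarts=1)
    sup.start_child("reader")
    sup.start_child("writer")
    print(sup.child_crashed("reader"))  # both children are restarted
    try:
        sup.child_crashed("reader")     # second crash exceeds the limit
    except Escalate as exc:
        print("escalated:", exc)
```

In a real tree the parent supervisor would treat the escalation as a crash of one of its own children and apply its own strategy, which is how a fault can travel all the way up to the node.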
Segora
[1] Making Reliable Distributed Systems in the Presence of Software Errors (2003), http://citeseer.ist.psu.edu/armstrong03making.html