Because they have another option: move a process to another machine. Imagine you start a batch job that requires 10,000 processes. Some of your processes may get dumped on machines running web search. Now imagine one of those search processes gets a burst of requests and its working set grows. Your process may get killed so that web search doesn't start thrashing. The controller for your job can then move that process's work somewhere else. This lets Google take advantage of underused resources on production clusters without disrupting production services.
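
To make the "kill it and move the work" idea concrete, here is a toy sketch of what a job controller might do when one of its processes is evicted. This is not Google's scheduler. The class `JobController`, the `on_evicted` callback, and the machine names are all made up for illustration, and placement is reduced to "pick the least-loaded machine that didn't just evict us".

```python
from collections import deque


class JobController:
    """Hypothetical controller: tracks where each task runs and reschedules evictions."""

    def __init__(self, machines):
        self.machines = list(machines)
        self.placement = {}      # task id -> machine name
        self.pending = deque()   # tasks waiting for a machine

    def submit(self, task_ids):
        self.pending.extend(task_ids)
        self._schedule()

    def _schedule(self, avoid=None):
        # Place each pending task on the least-loaded machine, skipping `avoid`.
        candidates = [m for m in self.machines if m != avoid]
        while self.pending and candidates:
            task = self.pending.popleft()
            load = {m: 0 for m in candidates}
            for machine in self.placement.values():
                if machine in load:
                    load[machine] += 1
            self.placement[task] = min(candidates, key=lambda m: load[m])

    def on_evicted(self, task):
        # Assumed callback: the machine killed our low-priority process to
        # protect its own service. Requeue the work and place it elsewhere.
        evicted_from = self.placement.pop(task)
        self.pending.append(task)
        self._schedule(avoid=evicted_from)
        return evicted_from


# Hypothetical cluster: one web search machine and two batch machines.
controller = JobController(["search-1", "batch-1", "batch-2"])
controller.submit(range(6))
controller.on_evicted(0)  # task 0 was running on search-1; it moves elsewhere
```

The point is that the batch job never needs the busy machine to swap on its behalf. Losing a process just means the controller hands the same work to some other machine.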