The old sysadmin adage, "if it ain't broke, don't fix it!", hinges on the assumption that we would know if it's "broke" [sic].
Security issues are the obvious and dangerous counterexample to this assumption. If there's a vulnerability, a subtle attacker may exploit it for an arbitrary length of time without ever exposing the breakage to their victims.
Frequently the sysadmin also extends this philosophy to the next logical adage: "better the devil you know ..."
To mitigate the risks posed by upgrading (the fear that the upgrade will break stuff), perhaps we'd be better off asking how we can effectively adopt some analog of the "test driven development" model that is at the heart of agile programming methodologies.
How could we automate a test suite covering the functionality of our systems, so that we could deploy a set of changes (upgrades, new package installations, configuration changes, etc.) with confidence that nothing (that we tested for) was broken in the process?
First we have to have a way to roll back our changes.
The old brute force method is to swap in whole (spare) machines or hard drives ... restore the existing system onto them (or replicate it) ... then deploy the changes there. In that case the rollback is to switch back to the primary (non-spare) system or hard drives. (This is a simplification, since we must also account for any changes made to "live" production data during the testing --- and that can be arbitrarily complicated for specific applications.)
It may be possible to use LVM snapshotting as a more elegant and far more lightweight alternative to wholesale drive/system replacement. I would love to see a good HOWTO covering that process.
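Short of a full HOWTO, here is a minimal sketch of what the snapshot/rollback steps might look like when driven from Python. The volume group "vg0", the logical volume "root", the snapshot name "pre_upgrade" and the 5G snapshot size are all assumptions for illustration, not recommendations. The commands are only printed unless dry_run is turned off, and the real thing needs root and enough free extents in the volume group.

```python
import subprocess

def snapshot_cmd(vg, lv, snap_name, size="5G"):
    """Build the lvcreate command that snapshots /dev/<vg>/<lv> before a change."""
    return ["lvcreate", "--snapshot", "--size", size,
            "--name", snap_name, f"/dev/{vg}/{lv}"]

def rollback_cmd(vg, snap_name):
    """Build the lvconvert command that merges the snapshot back into its origin."""
    return ["lvconvert", "--merge", f"/dev/{vg}/{snap_name}"]

def discard_cmd(vg, snap_name):
    """Build the lvremove command that drops the snapshot once the change is accepted."""
    return ["lvremove", "--yes", f"/dev/{vg}/{snap_name}"]

def run(cmd, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if dry_run:
        return 0
    return subprocess.run(cmd, check=True).returncode

if __name__ == "__main__":
    # Hypothetical names: adjust to the local volume layout.
    run(snapshot_cmd("vg0", "root", "pre_upgrade"))
    # ... deploy the changes and run the test suite here ...
    run(rollback_cmd("vg0", "pre_upgrade"))   # if the tests failed
    # run(discard_cmd("vg0", "pre_upgrade"))  # if the tests passed
```

One caveat to keep in mind: if the origin volume is in use (a mounted root filesystem, say), the merge is deferred until the volume is next activated, so in practice the rollback may only take effect after a reboot. The snapshot also only covers that one volume, so the "live data" problem mentioned above still applies.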
Next we need a framework for running our tests.
For servers, the functionality tests can start with the same tools we use for monitoring our services. So if we have reasonable coverage through things like Nagios, then we should be able to add the test system to the monitoring system fairly easily --- and see alerts for any service that's obviously broken.
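To make that concrete, here is a rough sketch of a monitoring-style check that can double as a post-upgrade smoke test. It follows the usual Nagios plugin exit code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The host name "testbox.example.com" and the list of ports are made up for illustration, and a bare TCP connect is about the weakest check imaginable; the existing Nagios plugins do far more.

```python
import socket
import sys

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_tcp(host, port, timeout=3.0):
    """Return (status, message) for a simple TCP connect test."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"TCP OK - {host}:{port} accepts connections"
    except OSError as exc:  # refused, timed out, unresolvable host, ...
        return CRITICAL, f"TCP CRITICAL - {host}:{port}: {exc}"

def run_suite(checks):
    """Run every (name, host, port) check; return the worst status seen."""
    worst = OK
    for name, host, port in checks:
        status, message = check_tcp(host, port)
        print(f"{name}: {message}")
        worst = max(worst, status)
    return worst

if __name__ == "__main__":
    # Hypothetical system under test and services.
    suite = [
        ("ssh", "testbox.example.com", 22),
        ("dns", "testbox.example.com", 53),
        ("ldap", "testbox.example.com", 389),
    ]
    sys.exit(run_suite(suite))
```

Run against the upgraded copy of the system, a non-zero exit status is the signal to roll back rather than promote the change.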
However, this only tests for obvious breakage. Monitoring systems are designed and tuned to minimize the load they impose, for example. So if our system under test has a capacity handling issue --- if the upgrades would make our new copy of BIND fall over when all our systems are hammering on it with DNS requests ... or (more likely) our LDAP server upgrades kill the LDAP performance under high load (or truncate bulk queries, or whatever) ... these are things that have to be tested for separately.
So we need a suite of capacity/load tests.
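As a very rough sketch of the idea, the function below hammers an arbitrary "probe" callable from a pool of threads and reports failures and the 95th percentile latency. The request count, the worker count and the DNS example at the bottom are assumptions for illustration; a serious capacity test would use purpose-built load generators and traffic that resembles production.

```python
import socket
import time
from concurrent.futures import ThreadPoolExecutor

def load_test(probe, requests=200, workers=20):
    """Call probe() `requests` times from `workers` threads; summarize the results."""
    def timed(_):
        start = time.perf_counter()
        try:
            probe()
            ok = True
        except Exception:
            ok = False  # any exception counts as a failed request
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(timed, range(requests)))

    latencies = sorted(t for ok, t in results if ok)
    failures = sum(1 for ok, _ in results if not ok)
    # Simple index-based 95th percentile of the successful requests.
    p95 = latencies[int(0.95 * (len(latencies) - 1))] if latencies else None
    return {"requests": requests, "failures": failures, "p95_seconds": p95}

if __name__ == "__main__":
    # Hypothetical probe: resolve a name through the system resolver.
    # Local caching may mean this never reaches the DNS server under test.
    print(load_test(lambda: socket.getaddrinfo("www.example.com", 80)))
```

Comparing these numbers before and after an upgrade, on the same hardware and with the same probe, is what turns this from a toy into a regression test for capacity.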
Testing workstations for user/interactive issues is far trickier.
Ideally we should work with the upstream maintainers to help develop test suites ... those can be used during development, after packaging, and by distribution maintainers to test for integration issues (things that only show up when combining the packages with others in the same distribution) ... and finally these could be packaged up so that savvy sysadmins could re-use them to test for deployment issues.
What I'm proposing is that we build "soup to nuts" testing to catch issues at any stage from development to deployment.