I am under the impression that there is a relative dearth of formal testing, so more testing will obviously be better. I was assuming testing at the appropriate levels in the stack.
It would already be great to pick lower-hanging fruit like automatic tests for the library/programmer/kernel/userspace/filesystem/device interfaces. These must never change without good reason, and when they do break, you jolly well want to know as soon as possible. Regressions Are Bad. If these tests were put into an installable package, then anyone who wanted to help in the testing effort could run them in their own peculiar environment and optionally have the failures forwarded to a central clearinghouse, like Microsoft does with its system crashes.
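To make the idea concrete, here is a rough sketch of what such a packaged interface test might look like. Everything in it is my own assumption for illustration: the three checks, the function names, and the shape of the report are hypothetical, and the actual forwarding to a clearinghouse is deliberately left out, since no such service is specified here. It only builds the report that an opt-in uploader could send.

```python
import errno
import json
import os
import platform
import tempfile


def check_pipe_roundtrip():
    """Bytes written to a pipe must come back unchanged."""
    r, w = os.pipe()
    try:
        os.write(w, b"ping")
        return os.read(r, 4) == b"ping"
    finally:
        os.close(r)
        os.close(w)


def check_missing_file_errno():
    """Opening a nonexistent path must fail with ENOENT."""
    with tempfile.TemporaryDirectory() as d:
        try:
            os.open(os.path.join(d, "no-such-file"), os.O_RDONLY)
        except OSError as exc:
            return exc.errno == errno.ENOENT
    return False


def check_closed_fd_errno():
    """Reading from a closed descriptor must fail with EBADF."""
    r, w = os.pipe()
    os.close(w)
    os.close(r)
    try:
        os.read(r, 1)
    except OSError as exc:
        return exc.errno == errno.EBADF
    return False


# Hypothetical check list; a real package would carry many more.
CHECKS = [check_pipe_roundtrip, check_missing_file_errno, check_closed_fd_errno]


def run_checks(checks=CHECKS):
    """Run each check and return a report describing the environment and failures."""
    failures = []
    for check in checks:
        detail = ""
        try:
            ok = check()
        except Exception as exc:  # an unexpected exception counts as a failure
            ok = False
            detail = repr(exc)
        if not ok:
            failures.append({"check": check.__name__, "detail": detail})
    return {
        "system": platform.system(),
        "release": platform.release(),
        "machine": platform.machine(),
        "checks_run": len(checks),
        "failures": failures,
    }


if __name__ == "__main__":
    # Print the report; forwarding it anywhere would be a separate, opt-in step.
    print(json.dumps(run_checks(), indent=2))
```

The point of keeping each check tiny and self-describing is that a failure report from someone's peculiar environment names the exact interface that changed, along with the system and release it changed on.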
Harder tests, like response latency under varying loads and configurations, memory-management fragmentation problems, etc., necessarily have a symbiotic relationship with the code they are testing, and have to be maintained together with it.