You clearly know more about digital cameras than I do. The idea is not new; see for example http://www.lavarnd.org/what/process.html . I've never seriously proposed it because I figured there was more to it. That site, for example, also doesn't consider the issues you raised; it just asserts that by throwing enough SHA-1 hashes and bit-swapping at the problem the result is random. I think he misses the point that the problem is not getting random data, but knowing how random your data is.
For example, if you have 50kB of data that you know has 50% entropy you can "distill" it. The trick is knowing the 50%.
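To make the "distill" step concrete, here is a rough sketch of one common way to do it: hash the raw data in blocks sized so that each digest is fed at least as much estimated entropy as it emits. The function name `distill`, the choice of SHA-256 and the block-sizing rule are my own assumptions for illustration, not something the lavarnd site or anyone in this thread specified. The output is only as good as the `entropy_fraction` you pass in, which is exactly the number that is hard to know.

```python
import hashlib
import math


def distill(raw: bytes, entropy_fraction: float) -> bytes:
    """Condense raw samples into hash output (illustrative sketch only).

    entropy_fraction is the caller's ESTIMATE of how much of the raw data
    is real entropy (0.5 means "50% entropy"). Nothing here verifies it.
    """
    if not 0.0 < entropy_fraction <= 1.0:
        raise ValueError("entropy_fraction must be in (0, 1]")

    digest_size = hashlib.sha256().digest_size  # 32 bytes
    # Raw bytes needed so one block carries about one digest's worth of
    # estimated entropy.
    block_size = math.ceil(digest_size / entropy_fraction)

    out = bytearray()
    # Only full blocks are used; a short trailing remainder is dropped.
    for start in range(0, len(raw) - block_size + 1, block_size):
        out += hashlib.sha256(raw[start:start + block_size]).digest()
    return bytes(out)
```

With 50,000 bytes of input and an estimate of 0.5, this hashes 64-byte blocks down to 32 bytes each, so you get roughly half the input size back. If the true figure is lower than 50%, the output looks just as random to any casual test but carries less entropy than its length suggests.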
You're right that you don't need to cover the lens; I left that out for simplicity. I tried this out on a camera being used for another purpose. The image had some light areas and some dark, and the "randomness" was clearly dependent on the light level at each point. But like you say, the camera has internal processing (to handle e.g. varying light conditions) and it's hard to know what effect that is having on the output.
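For what it's worth, the light-level dependence could be put into numbers along these lines. This is only a sketch under my own assumptions: two consecutive frames given as flat sequences of 8-bit grayscale pixel values, the two low bits of the pixel difference taken as the "noise", and hypothetical names `min_entropy_bits` and `entropy_by_light_level`. A frequency count like this only gives an optimistic figure; it says nothing about correlations between neighbouring pixels or whatever the camera's internal processing introduces.

```python
import math
from collections import Counter, defaultdict


def min_entropy_bits(samples):
    """Naive min-entropy estimate in bits per sample, from frequencies."""
    counts = Counter(samples)
    # -log2(p_max), written as log2(1 / p_max)
    return math.log2(len(samples) / max(counts.values()))


def entropy_by_light_level(frame_a, frame_b, buckets=4):
    """Group pixels by brightness in frame_a and estimate the noise entropy
    of each group from the two low bits of the frame-to-frame difference."""
    groups = defaultdict(list)
    for a, b in zip(frame_a, frame_b):
        level = min(a * buckets // 256, buckets - 1)  # brightness bucket
        groups[level].append((a - b) & 0x03)          # low 2 bits of change
    return {level: min_entropy_bits(vals)
            for level, vals in sorted(groups.items())}
```

If the estimate differs a lot between dark and bright buckets, that is at least a hint that a single overall entropy figure for the whole frame would be misleading.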
Raw output would be best, but I don't know if cheap webcams offer that :)