Block generation is voluntary, and the average user is not expected to do it. The current client is an early (but working) reference client, and the blockchain is bulky and expected to grow larger in the future. The system allows for a "lightweight" client that can verify transactions sent to it with a fair degree of certainty on its own, without needing the entire blockchain and without the ability to generate new blocks. The network is expected to stratify into various clients with different goals. Mybitcoin.com is a working example of this, as even a lightweight client isn't necessary for anyone with cheap and continuous Internet access on their cell phone.
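To make the "lightweight" idea a bit more concrete, here is a minimal sketch of how such a client could check that a transaction belongs to a block without holding the block itself: it only needs the Merkle root from the block header plus a short branch of sibling hashes. The function names (merkle_root, merkle_branch, verify_branch) are my own for illustration and are not taken from the reference client. The sketch uses double SHA-256 and duplicates the last hash on odd-sized levels, as Bitcoin's tree does, but it leaves out serialization and byte-order details, and it assumes the client already trusts the header it is checking against.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """SHA-256 applied twice, the hash used in Bitcoin's Merkle tree."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Pair up hashes (duplicating the last one if the count is odd)."""
    if len(level) % 2 == 1:
        level = level + [level[-1]]
    return [double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)]


def merkle_root(tx_hashes):
    """What a full node computes: the root over every transaction hash."""
    if not tx_hashes:
        raise ValueError("need at least one transaction hash")
    level = list(tx_hashes)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(tx_hashes, index):
    """What a full node hands to a lightweight client: sibling hashes only."""
    branch = []
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # sibling of the current node
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(tx_hash, index, branch, root):
    """What the lightweight client runs: rebuild the root from the branch."""
    current = tx_hash
    for sibling in branch:
        if index % 2 == 0:
            current = double_sha256(current + sibling)
        else:
            current = double_sha256(sibling + current)
        index //= 2
    return current == root


# Hypothetical block with five made-up transaction hashes.
txs = [double_sha256(bytes([i])) for i in range(5)]
root = merkle_root(txs)
branch = merkle_branch(txs, 3)
print(verify_branch(txs[3], 3, branch, root))
```

The branch grows with the logarithm of the number of transactions in the block, which is why the client can get by without the bulky chain data.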
As for the energy costs, that is a self-balancing system in ways that are too complex to explain fully here. Much of the current generation is performed on GPUs, which are far more energy efficient, and much of it is also performed by people for whom the extra cost is nominal because they need electric heat anyway, such as the lone geek in Toronto whose small apartment must be heated with electric resistive heat regardless. As the reward for generation drops, the number of people willing to generate will drop as well, which keeps the total energy consumed in balance with what the market is willing to spend to support the network. It's not true that all transactions are free, nor will they always be free, and the energy consumption is a direct reflection of the willingness of the userbase to contribute clock cycles to secure the system. Also, as the blockchain grows, so does the total proof-of-work that it represents, and with it the computing power an attacker would need to rewrite it. Assuming that available computing power remains stable (a silly assumption), the need for high difficulty levels to protect the blockchain gradually goes down. Not that I expect computers to suddenly fail Moore's law, but relative to the computing power available, blockchain security becomes more efficient over time.
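For anyone who wants to play with the self-balancing argument, here is a toy model of it, nothing more. It assumes each generator earns a share of the daily reward proportional to hashrate and drops out when that share no longer covers their marginal electricity cost. The Miner class, the stable_miners function, and every number below are invented for illustration; the "heater" entry stands in for the Toronto case, where the marginal cost is taken to be zero because the heat would be paid for anyway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Miner:
    name: str
    hashrate: float    # arbitrary units
    daily_cost: float  # marginal electricity cost, same currency as reward


def stable_miners(miners, daily_reward):
    """Drop the least profitable generator until nobody is losing money."""
    active = list(miners)
    while active:
        total = sum(m.hashrate for m in active)
        # Profit = proportional share of the reward minus marginal cost.
        worst = min(
            active,
            key=lambda m: daily_reward * m.hashrate / total - m.daily_cost,
        )
        if daily_reward * worst.hashrate / total - worst.daily_cost >= 0:
            break  # everyone left at least breaks even
        active.remove(worst)
    return active


# Hypothetical population of generators.
population = [
    Miner("gpu", hashrate=100.0, daily_cost=2.0),
    Miner("heater", hashrate=10.0, daily_cost=0.0),
    Miner("cpu", hashrate=5.0, daily_cost=1.0),
]

for reward in (10.0, 1.0):
    active = stable_miners(population, reward)
    spend = sum(m.daily_cost for m in active)
    print(reward, [m.name for m in active], spend)
```

With these made-up numbers, cutting the reward pushes out the generators who pay real money for power and leaves the one who was heating the apartment anyway, so the total spent on electricity falls along with the reward.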