I think the key issue is that the current model assumes the user will only ever use one application to touch their pictures.
The Grumpy Editor (and many of the rest of us) want to be able to use multiple different apps to deal with the pictures.
We may use Shotwell to view and organize the pictures, but GIMP to edit them (or ImageMagick to resize them, or dcraw to convert from raw, etc.).
We could let Shotwell import them (and copy them) into its own structure, and then point these tools at the results, but this will fail as soon as any other tool also wants to take complete control of the pictures. That is why Shotwell is viewed as anti-social for doing so.
We understand that this is common practice in the proprietary software world (Windows and Mac especially), but the Unix philosophy is to let the user combine multiple tools and pick best-of-breed solutions for each task. It's nice if there is one GUI that can leverage the specialized work and be a single panel for doing the common work, but if that prevents other tools from being used, that GUI can easily become more of a liability than an asset.
It's not strictly necessary for all the metadata to be stored in the original files, but it should be stored in some way that makes it possible for other apps (and/or scripts) to find, access, and ideally modify this data. Putting it in EXIF is one way (and probably the easiest cross-application method), but not the only way.
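To make the "not the only way" point concrete, here is a rough sketch of one alternative: a plain-text sidecar file sitting next to each picture, which any app or script can read and update. The `.meta.json` naming and the function names are made up for illustration; this is not what Shotwell or any other tool actually does.

```python
import json
from pathlib import Path


def sidecar_path(image: Path) -> Path:
    # Hypothetical convention: photo.jpg -> photo.jpg.meta.json
    return image.with_name(image.name + ".meta.json")


def read_metadata(image: Path) -> dict:
    # A missing sidecar simply means "no metadata yet".
    path = sidecar_path(image)
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def update_metadata(image: Path, **fields) -> dict:
    # Merge new fields into whatever another tool already wrote,
    # leaving the image file itself untouched.
    meta = read_metadata(image)
    meta.update(fields)
    sidecar_path(image).write_text(
        json.dumps(meta, indent=2, sort_keys=True), encoding="utf-8"
    )
    return meta
```

Something like `update_metadata(Path("photo.jpg"), tags=["beach"])` from one tool and `read_metadata(Path("photo.jpg"))` from another would then see the same data. The point is only that the format is open and sits where other tools can find it.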
This also isn't saying that copies of the pictures shouldn't be made. In fact, if you modify the image (especially in a way that cannot be perfectly undone), you _should_ make a copy of the image in some manner before you do so.
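As a minimal sketch of the "copy before you modify" idea, a script could do something like the following before handing a file to an editor. The `.orig` suffix and the function name are assumptions for the example, not a convention any particular tool uses.

```python
import shutil
from pathlib import Path


def backup_original(image: Path) -> Path:
    # Hypothetical convention: photo.jpg -> photo.jpg.orig, same directory.
    backup = image.with_name(image.name + ".orig")
    # Only copy once, so repeated edits never overwrite the pristine
    # original with an already-modified version.
    if not backup.exists():
        shutil.copy2(image, backup)  # copy2 also tries to keep timestamps
    return backup
```

Because the backup sits next to the original under a predictable name, other tools can still find both files without going through one application's private library.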