Homomorphic encryption doesn't really line up with how we use data stored "in the cloud". We don't, as a rule, do much in the way of computation on it.
Instead, we break down the task like this:
- We have databases that are very good at two things: storing huge amounts of data reliably, and finding it again, i.e. searching. This has been true for decades, and looks like it will hold true in the cloud era.
- We have powerful CPUs that ask the storage engine for a small subset of that data, then display and/or modify it and ask the storage engine to save it again.
If you look at this breakdown in terms of what we have to trust, it turns out we have to trust the thing displaying the data to the human. We don't have a choice in that: humans can't do decryption, so the CPU must do it, so it will inevitably see the plaintext. So in the end we have to trust it to keep that plaintext private.
What about the storage backends? Do we have to trust them? Well, until we started subcontracting storage out to 3rd parties (i.e., in marketing speak, placing our data in the cloud) we didn't have to think about it: while we had separated out the storage and compute functions for other reasons, it was still all owned by us. But now it's not. Yet we have continued to apply the same level of trust as before, and as a consequence, for instance, Gmail gets to read all your mail even though all it is doing is storing it. And Dropbox is unencrypted, even though there is no need for it to be that way.
Speaking for myself, I would prefer it wasn't that way. Speaking from the viewpoint of someone selling cloud services, they would probably be better off if it wasn't that way. I know my government refuses to give them business because they can see into every piece of data given to them, and I hope my insurance company, my doctor, my tax agent and just about everybody else that stores my private data refuses to deal with them for the same reason.
Having said that, I would love to be able to delegate storage of my data to someone like Google. They are so darned good at it, and the price is absurdly cheap.
OK, so I would love Google to be my storage engine. Remember what the storage engine did - it stored and searched. Sending them encrypted stuff so they don't know what is in it is drop dead easy. But searching the data while it remains encrypted so a suitably small subset can be returned to the CPU we must trust - well that is an interesting problem. Finding exact matches - that's easy. But after thinking about it, I think you need more - you need to be able to select all data that sits in a particular range.
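To make the "exact matches are easy" part concrete, here is a minimal sketch of one common way to do it: the client derives a keyed, deterministic tag for each searchable field and the storage side indexes on the tag alone. The names `blind_tag`, `UntrustedStore` and `index_key` are made up for illustration, and the blobs are placeholders standing in for ciphertext produced by whatever real cipher the client uses. Note the trade-off: because the tags are deterministic, the storage side can still tell which items share a value, even though it can't tell what the value is.

```python
import hashlib
import hmac
from collections import defaultdict


def blind_tag(index_key: bytes, field: str, value: str) -> bytes:
    """Deterministic keyed tag for one field value.

    Only someone holding index_key can compute or verify it.
    Values are lower-cased so matching is case-insensitive.
    """
    msg = field.encode() + b"\x00" + value.lower().encode()
    return hmac.new(index_key, msg, hashlib.sha256).digest()


class UntrustedStore:
    """Stands in for the storage provider: it sees only tags and opaque blobs."""

    def __init__(self):
        self._by_tag = defaultdict(list)

    def put(self, tags, blob: bytes) -> None:
        # Index the same opaque blob under every tag the client supplies.
        for tag in tags:
            self._by_tag[tag].append(blob)

    def find(self, tag: bytes):
        # Exact-match lookup; the store never learns the underlying value.
        return list(self._by_tag.get(tag, []))


# Client side (hypothetical key; placeholder ciphertext).
index_key = b"\x01" * 32
store = UntrustedStore()
store.put([blind_tag(index_key, "from", "alice@example.com")], b"<ciphertext 1>")
store.put([blind_tag(index_key, "from", "bob@example.com")], b"<ciphertext 2>")
matches = store.find(blind_tag(index_key, "from", "alice@example.com"))
```

The client then decrypts whatever comes back. Nothing here helps with ranges, which is the harder problem.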
There are some systems that seem to say they can do this. I have the papers on my laptop, but I haven't got around to reading them with the care required yet.
> Wouldn't a range search be a means of decryption?
Yes, to an extent. But remember the purpose of all this searching is to whittle down the data into something that is small enough to be packaged up and sent to a remote CPU for processing. The purpose is not to "find the exact item". The first step would be to not allow the bulk of the data to be searched at all. For example, for email, just restricting it to date ranges might be sufficient. Certainly just making the header data searchable would cover almost all the searches I do. And you can use tricks - like returning encrypted rubbish even when the search returns nothing, and restricting the precision of the search - say by ensuring you can't reduce the date range to fewer than 10 emails.
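As a rough illustration of those two tricks, here is a sketch that swaps a true range search for something cruder: dates are reduced to month buckets, each bucket gets a keyed tag, and a range query is just a list of bucket tags. The store never returns fewer than a fixed number of blobs, padding with random bytes when needed. Everything named here (`month_tag`, `tags_for_range`, `BucketStore`, `MIN_RESULTS`, `BLOB_SIZE`) is an assumption of mine, not something taken from the papers mentioned above, and it assumes fixed-size ciphertexts so the padding is indistinguishable from real items.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict
from datetime import date

MIN_RESULTS = 10  # assumed floor, taken from the "10 emails" figure above
BLOB_SIZE = 64    # assumed fixed ciphertext size so decoys look like real blobs


def month_tag(index_key: bytes, day: date) -> bytes:
    """Keyed tag for the calendar month containing `day` (coarse on purpose)."""
    label = day.strftime("%Y-%m").encode()
    return hmac.new(index_key, label, hashlib.sha256).digest()


def tags_for_range(index_key: bytes, start: date, end: date) -> list:
    """Client side: one tag per calendar month touched by [start, end]."""
    tags = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tags.append(month_tag(index_key, date(year, month, 1)))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return tags


class BucketStore:
    """Untrusted side: maps opaque bucket tags to opaque blobs."""

    def __init__(self):
        self._buckets = defaultdict(list)

    def put(self, tag: bytes, blob: bytes) -> None:
        self._buckets[tag].append(blob)

    def query(self, tags) -> list:
        found = [blob for tag in tags for blob in self._buckets.get(tag, [])]
        # Pad with random bytes so a reply never has fewer than MIN_RESULTS
        # items, even when nothing matched at all.
        while len(found) < MIN_RESULTS:
            found.append(secrets.token_bytes(BLOB_SIZE))
        secrets.SystemRandom().shuffle(found)
        return found
```

The client would decrypt what comes back, throw away anything that fails authentication (the padding), and filter to the exact dates locally. The query can never be narrower than a month, so the precision limit comes from the bucketing itself, not from any check on the query.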
It seems to me a workable compromise could be reached.
Copyright © 2017, Eklektix, Inc.