Guarding personally identifiable information

Posted Jun 9, 2017 8:49 UTC (Fri) by tialaramex (subscriber, #21167)
In reply to: Guarding personally identifiable information by bronson
Parent article: Guarding personally identifiable information

The article essentially describes the encryption treadmill: the idea that incremental advances will inevitably obsolete your encryption. But it's not really like that. The main reason crypto people assume secrets have a finite lifetime is that secrets are kept by people, and people leak, regardless of the technology involved. In practice a finite lifetime is acceptable.

Mechanically, we're probably past the point where incremental technical advances can be expected to have any effect on symmetric encryption. DES key lengths were already too short in 1975; you should be able to find contemporary writing that backs up this criticism. Sure enough, the only attack that has actually been used successfully against DES is a brute-force attack on the key. Rijndael increases the minimum key size to 128 bits, which puts a brute-force attack likely permanently out of reach, but even the original DES algorithm, in the back-to-back-to-back 3DES construction that allows longer keys, is still safe today if you don't mind it being slow and awkward.
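To put rough numbers on "out of reach", here is a back-of-the-envelope sketch in Python. The guessing rate is an assumption picked purely for illustration (a round number, not a measurement of any real hardware), and the function name is made up for this example.

```python
# Rough exhaustive-search cost for several key sizes.
# ASSUMPTION: the rate below is an arbitrary round number, not a benchmark.
ASSUMED_KEYS_PER_SECOND = 10**12
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_search(key_bits, rate=ASSUMED_KEYS_PER_SECOND):
    """Years needed to try every key of the given size at the assumed rate."""
    return 2**key_bits / rate / SECONDS_PER_YEAR


# 56 = single DES, 112 and 168 = the 3DES figures, 128 = smallest Rijndael/AES key.
for bits in (56, 112, 128, 168):
    print(f"{bits:3d}-bit key: about {years_to_search(bits):.3g} years")
```

Under that assumed rate a 56-bit key space is exhausted in under a day, while a 128-bit key space takes on the order of 10^19 years. Every extra key bit doubles the work, so faster hardware shifts the small numbers a lot and the large ones hardly at all.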

We have tended to abandon encryption algorithms once someone demonstrates that, even in theory, they can sometimes be broken with practical resources. In contrast, anonymization techniques are _always_ theoretically broken; it's just that sometimes nobody bothers to break them in practice. If we're going to compare to something, how about the Yale lock on a farmhouse back door? Probably the door isn't locked anyway, and if it is, anyone who spends a few minutes online learning how can break the lock. That's where we are with anonymization. Our best hope is that nobody even _wants_ to de-anonymize the data, not that they can't.

Guarding personally identifiable information

Posted Jun 15, 2017 4:31 UTC (Thu) by bronson (subscriber, #4806)

With 200 GPUs, DES is brute forced in a matter of days. And 200 GPUs are available today on AWS. Papers describing this sort of attack go back at least 8 years. It's folly to claim that DES is still safe.

Here is a nice picture of the treadmill, for hash functions anyway: http://valerieaurora.org/hash.html

Guarding personally identifiable information

Posted Jun 26, 2017 14:42 UTC (Mon) by tialaramex (subscriber, #21167)

The key is in your first sentence. DES has to be _brute forced_. My comment already explained that "DES key lengths were already too short in 1975". The papers go back a LOT further than eight years; the EFF DES Cracker is _last century_. But what the papers don't do is break DES algorithmically. The algorithm is still, decades later, working exactly as intended: you can't find out what the message is without just trying all the keys.

And _that_ is why, again exactly as I wrote, 3DES is still safe. It's not a new algorithm using the same name, it's just three lots of DES, because the algorithm is still fine. Specifically, 3DES is E(key3, D(key2, E(key1, message))), so that if you set key1 = key2 = key3 you get DES as before, but if you set them differently the attacker must either brute force all 168 bits of key or rely on the meet-in-the-middle attack and do 2^112 operations, which is less work but still impractical today.
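The encrypt-decrypt-encrypt shape is easy to see in a few lines of Python. This is only a sketch: toy_encrypt and toy_decrypt are a made-up 64-bit Feistel cipher standing in for DES (they are NOT DES and not secure), and all the names are hypothetical. The point is the composition, not the cipher.

```python
import hashlib


def _round(key: bytes, i: int, half: int) -> int:
    """Round function of the toy cipher: 32 bits derived from key, round, half."""
    d = hashlib.sha256(key + bytes([i]) + half.to_bytes(4, "big")).digest()
    return int.from_bytes(d[:4], "big")


def toy_encrypt(key: bytes, block: int) -> int:
    """Stand-in for DES encryption: a 4-round Feistel network on a 64-bit block."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for i in range(4):
        left, right = right, left ^ _round(key, i, right)
    return (left << 32) | right


def toy_decrypt(key: bytes, block: int) -> int:
    """Inverse of toy_encrypt: undo the rounds in reverse order."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for i in reversed(range(4)):
        left, right = right ^ _round(key, i, left), left
    return (left << 32) | right


def triple_encrypt(key1, key2, key3, message):
    """E(key3, D(key2, E(key1, message))), the EDE construction."""
    return toy_encrypt(key3, toy_decrypt(key2, toy_encrypt(key1, message)))


def triple_decrypt(key1, key2, key3, ciphertext):
    """Reverse of triple_encrypt: D(key1, E(key2, D(key3, ciphertext)))."""
    return toy_decrypt(key1, toy_encrypt(key2, toy_decrypt(key3, ciphertext)))


message = 0x0123456789ABCDEF
k = b"same key"
# With three equal keys the middle D cancels the first E, leaving a single E.
print(triple_encrypt(k, k, k, message) == toy_encrypt(k, message))
# With three different keys the round trip still recovers the message.
c = triple_encrypt(b"k1", b"k2", b"k3", message)
print(triple_decrypt(b"k1", b"k2", b"k3", c) == message)
```

The middle step being a decryption rather than another encryption is what gives the backward compatibility: equal keys collapse the whole thing to one pass of the underlying cipher.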

Also Val's page is a member of the set of things which assume relatively short past trends will continue in order to predict the future. Her warning that you should plan on being _able_ to replace the hash in your shiny new thing is a sensible one, but the thing about such trends is that they're only notable while they stay true. Nobody is going to make a web page called "To our astonishment SHA-2 is still fine after 75 years". I always want to point to Disco Stu's graph of Disco record sales here...

And finally, while Val's advice is all very well, probably even _more_ useful would be to take the extra hour and learn more about what these things are. On the other side of the fence, lots of effort has gone into making modern algorithms and libraries have fewer "sharp edges", such as SHA-3's elimination of length extension, but the edges are only sharp if you have no idea what these crypto algorithms do and do not promise for you. People who write SHA2(m1) and are surprised that an adversary can use that to produce SHA2(m1 | padding | chosen suffix) correctly without knowing m1 (only its length) are protected by using SHA3() instead, where their adversary can't pull that off. But they'd _also_ be protected, and better, by knowing what's going on here so they wouldn't fall for that mistake in the first place.

Guarding personally identifiable information

Posted Jun 27, 2017 11:01 UTC (Tue) by paulj (subscriber, #341)

V interesting comment. BTW, in the interests of "knowing what's going on here", could you explain how SHA-3 eliminates the 'traditional' crypto-hash extension attack?

Guarding personally identifiable information

Posted Jun 27, 2017 11:50 UTC (Tue) by anselm (subscriber, #2796)

With MD5 or SHA-1, the hash is basically identical to the internal state of the hash function after it has seen the input up to that point. You can use this to hash more stuff and the result will be indistinguishable from what would have been returned if you had applied SHA-1 to the concatenation of the original input (which you technically don't know) and your stuff in one go.
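For the curious, the attack can be sketched in Python. This uses SHA-256 rather than MD5 or SHA-1 because it has the same "digest equals final state" structure and is what most people reach for today. Python's hashlib won't let you start from a chosen internal state, so the compression function is re-implemented here; the function names, the secret, and the messages are all invented for illustration, and the attacker is assumed to know (or guess) the length of the hidden input.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(n):
    """First n primes by trial division."""
    found, candidate = [], 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def _icbrt(n):
    """Integer cube root by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


# SHA-256 constants: fractional bits of square/cube roots of the first primes.
IV = [math.isqrt(p << 64) & MASK for p in _primes(8)]
K = [_icbrt(p << 96) & MASK for p in _primes(64)]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def compress(state, block):
    """One SHA-256 compression step: fold a 64-byte block into the state."""
    w = list(struct.unpack(">16I", block))
    for i in range(16, 64):
        s0 = _rotr(w[i - 15], 7) ^ _rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
        s1 = _rotr(w[i - 2], 17) ^ _rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
        w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
    a, b, c, d, e, f, g, h = state
    for i in range(64):
        t1 = (h + (_rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25))
              + ((e & f) ^ (~e & g)) + K[i] + w[i]) & MASK
        t2 = ((_rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22))
              + ((a & b) ^ (a & c) ^ (b & c))) & MASK
        a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
    return [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]


def padding(total_len):
    """The padding SHA-256 appends to a message of total_len bytes."""
    return b"\x80" + b"\x00" * ((55 - total_len) % 64) + struct.pack(">Q", total_len * 8)


def sha256_from(state, padded_data):
    """Run the compression function over already-padded data from a given state."""
    for i in range(0, len(padded_data), 64):
        state = compress(state, padded_data[i:i + 64])
    return struct.pack(">8I", *state)


def extend(digest, original_len, suffix):
    """Forge SHA-256(original | glue | suffix) from the digest and length alone."""
    glue = padding(original_len)
    state = list(struct.unpack(">8I", digest))  # the digest *is* the final state
    forged_len = original_len + len(glue) + len(suffix)
    return glue, sha256_from(state, suffix + padding(forged_len))


secret = b"unknown-to-attacker"  # hypothetical; never passed to extend()
message = b"amount=10"
digest = hashlib.sha256(secret + message).digest()

glue, forged = extend(digest, len(secret) + len(message), b"&amount=1000000")
real = hashlib.sha256(secret + message + glue + b"&amount=1000000").digest()
print(forged == real)  # the forgery works if this prints True
```

Note that extend() only ever sees the digest, a length, and the attacker's own suffix. The "glue" is the padding the original computation already absorbed, which is why the forged message has those bytes sitting between the original input and the suffix.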

SHA-3 avoids this by using an algorithm where the hash value it outputs doesn't let you infer its internal state. Even if you know the hash value, that doesn't tell you everything you would need to set up your own instance of the hash function in the state it reached after seeing the original material, so you cannot perform the extension attack. (For the gory details, check Wikipedia.)