More serious in some locales

Posted Jun 23, 2011 5:57 UTC (Thu) by Cato (subscriber, #7643)
Parent article: A hole in crypt_blowfish

Some locales, such as Russian ones, often use character sets in which every Cyrillic character has the high bit set; KOI8-R is a common example.

This makes the bug much more serious for such users than in locales where only a few accented characters are normally used - anyone setting a Cyrillic-only password in KOI8-R may find their password maps to the same hash as a huge class of other Cyrillic-only passwords.
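As a rough illustration of the point about high-bit characters, the sketch below encodes one Cyrillic word in both KOI8-R and UTF-8 and counts the bytes with the top bit set. The word and the helper name `high_bit_count` are arbitrary choices made for this example, not anything from the crypt_blowfish code.

```python
# Illustrative only: the sample word and helper name are assumptions.
word = "пароль"  # an arbitrary all-Cyrillic example word

koi8 = word.encode("koi8-r")
utf8 = word.encode("utf-8")


def high_bit_count(data: bytes) -> int:
    """Count bytes whose most significant bit is set (value >= 0x80)."""
    return sum(1 for b in data if b & 0x80)


for name, data in (("koi8-r", koi8), ("utf-8", utf8)):
    print(f"{name}: {len(data)} bytes, "
          f"{high_bit_count(data)} with the high bit set, hex {data.hex()}")
```

In both encodings every byte of this word has the high bit set; the visible difference is that KOI8-R uses one byte per letter while UTF-8 uses two, so the UTF-8 form of an all-Cyrillic word always has an even byte length.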

More serious in some locales

Posted Jun 24, 2011 0:18 UTC (Fri) by solardiz (subscriber, #35993)

Here's my analysis for Russian words in koi8-r and utf-8. Some excerpts:

"Lengths that are _not_ at risk: 1, 2, 4, 6, 8, 10, 12, 14, 16, 18.
The rest are at risk (meaning that 8-bit chars in _some_ positions result in 1 to 3 preceding chars being ignored)."
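To make "preceding chars being ignored" concrete, here is a small simulation of the key-packing step, assuming the flaw is the sign extension of 8-bit characters as they are ORed into 32-bit words (which is how the bug has been publicly described). The function `expand_key` is a hypothetical stand-in written for this sketch, not the library's code; it assumes the NUL-terminated key is repeated cyclically into 18 words, and the sample words are arbitrary.

```python
def expand_key(key: bytes, buggy: bool, words: int = 18) -> list[int]:
    """Pack the NUL-terminated key, repeated cyclically, into 32-bit words.

    With buggy=True, bytes >= 0x80 are sign-extended before being ORed in,
    which overwrites the bytes already placed in the same word.
    """
    data = key + b"\x00"
    out, pos = [], 0
    for _ in range(words):
        tmp = 0
        for _ in range(4):
            c = data[pos]
            if buggy and c >= 0x80:
                c |= 0xFFFFFF00  # models a signed char widened to 32 bits
            tmp = ((tmp << 8) | c) & 0xFFFFFFFF
            pos = (pos + 1) % len(data)
        out.append(tmp)
    return out


# Length 3 (listed as at risk): words differing only in the first letter.
three = [w.encode("koi8-r") for w in ("дом", "том", "ком", "лом")]
print(len({tuple(expand_key(k, buggy=True)) for k in three}))   # distinct buggy keys
print(len({tuple(expand_key(k, buggy=False)) for k in three}))  # distinct fixed keys

# Length 2 (listed as not at risk): the first letter still matters.
two = [w.encode("koi8-r") for w in ("да", "на")]
print(len({tuple(expand_key(k, buggy=True)) for k in two}))
```

In this model a 3-byte key plus its terminator fills exactly one word, so every character always lands at the same offset and the earlier ones are overwritten each time. A 2-byte key has a period of 3, so each character eventually appears in the last position of a word, where nothing follows to overwrite it.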

For 97946 different Russian words, I got "70890 (72%) and 97213 (99%) unique hashes for koi8-r and utf-8, respectively."

"For koi8-r, 22 hashes are seen over 100 times each, with the top one being seen 190 times. For utf-8, the top hash (most common) is seen 4 times, then 84 hashes are seen 3 times each.

"Thus, obviously the bug does cause collisions. There are not as many of those as some people might expect for nearly purely 8-bit inputs. Yet the very common hashes for koi8-r are worrisome. Even though if one were to run the entire koi8-r wordlist against a bunch of hashes they'd only achieve a 30% speedup due to the bug, if they focus on words producing 22 top hashes - so they only try 22 words - they'd crack around 3% of passwords based on randomly picked words from that list (assuming uniform distribution of random word numbers). For utf-8, this risk is much lower: trying top 85 passwords (0.087% of candidates) effectively tests 256 of them (0.26%)."
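The percentages quoted above can be re-derived from the counts given in the excerpts. The sketch below does only that arithmetic; the variable names are made up for this example, and the koi8-r range is a bound computed from "over 100" and "190", since the exact per-hash counts are not given here.

```python
total_words = 97946
unique_koi8, unique_utf8 = 70890, 97213

print(f"koi8-r unique hashes: {unique_koi8 / total_words:.1%}")
print(f"utf-8 unique hashes:  {unique_utf8 / total_words:.1%}")

# One reading of the whole-wordlist speedup: the share of hashes
# that no longer need to be computed because they are duplicates.
print(f"koi8-r duplicate share: {1 - unique_koi8 / total_words:.1%}")

# utf-8: the top hash is seen 4 times, the next 84 hashes 3 times each.
utf8_candidates = 1 + 84
utf8_covered = 4 + 84 * 3
print(f"utf-8: try {utf8_candidates} ({utf8_candidates / total_words:.3%}), "
      f"cover {utf8_covered} ({utf8_covered / total_words:.2%})")

# koi8-r: 22 hashes, each seen more than 100 and at most 190 times.
koi8_low, koi8_high = 22 * 101, 22 * 190
print(f"koi8-r top 22 cover between {koi8_low / total_words:.1%} "
      f"and {koi8_high / total_words:.1%} of the words")
```

The "around 3%" figure falls inside the range these bounds give.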

More serious in some locales

Posted Jun 24, 2011 7:48 UTC (Fri) by Cato (subscriber, #7643)

Thanks for the analysis - not as serious as I thought, but it makes sense that using KOI8-R results in more duplicate hashes than UTF-8.

More serious in some locales

Posted Jun 24, 2011 16:17 UTC (Fri) by kozerog (guest, #75519)

I'm surprised anyone still uses KOI8-R instead of UTF-8.
