paulej wrote:Yes, mod 62 will create bias. This is something I wrestled with for a bit for SinglePass. I could not find a good solution for that, so I gave up.
For a log-in password a little bias is not a concern. I'm pretty sure fixes exist, but since I'm not that well-versed I'm not sure what they are.
I've read numerous times that bias in a random number generator is definitely a problem, but we're just making log-in passwords with SinglePass.
paulej wrote:I considered an approach where I keep reading more random values in the hash that gets generated. The problem is that there is still the possibility of running out of octets. That would be extremely rare, but it's possible. And the bias is fairly minor. It exists, but only if the octet is >= 248 in value. That results in wrapping back around to 0..7. In my experiments, this often only happens to a single character in a typical password. So out of one 16-octet password, there is usually about 1 character that might see bias. How serious is that?
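To put numbers on the wrap-around described above, here is a minimal Python sketch. It assumes a 62-character alphabet indexed by `octet % 62`; the name `ALPHABET_SIZE` is just illustrative and is not taken from SinglePass. Since 256 = 4 * 62 + 8, indices 0..7 are hit by five octet values each and the rest by four. An octet lands in 248..255 with probability 8/256, which works out to an average of about 0.5 such octets in a 16-octet password.

```python
from collections import Counter

ALPHABET_SIZE = 62  # assumed alphabet: 0-9, A-Z, a-z

# Count how many of the 256 possible octet values map to each index.
counts = Counter(octet % ALPHABET_SIZE for octet in range(256))

# Indices that are reachable from five octet values instead of four.
favored = sorted(index for index, hits in counts.items() if hits == 5)

print("favored indices:", favored)
print("hits for index 0:", counts[0], "hits for index 8:", counts[8])
print("expected octets >= 248 in 16 octets:", 16 * 8 / 256)
```

So the favored characters show up with probability 5/256 instead of 4/256, and only the first eight characters of the alphabet are affected.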
I understand what you're saying about the possibility of running out of octets, and I think that's one of the standard "fixes" to the bias problem: just keep throwing more digits into the soup (I'm not sure I understand how that fixes the bias, so I need to study it more). To give you more octets to play with, I guess an alternative approach might be to use a longer hash such as SHA512 instead of SHA256.
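As a rough check on how rare "running out of octets" would be, here is a small Python sketch. It assumes each hash octet is uniform and independent, that octets of 248 or more are discarded, and that 16 characters are needed; `prob_run_out` is a hypothetical helper name. The extra octets do not remove the bias by themselves: they just replace the discarded ones, and the 248 values that are kept divide evenly by 62.

```python
from fractions import Fraction
from math import comb

REJECT = Fraction(8, 256)  # chance that a uniform octet is in 248..255


def prob_run_out(octets: int, needed: int = 16) -> Fraction:
    """Probability that fewer than `needed` of `octets` octets are usable."""
    accept = 1 - REJECT
    # Sum the binomial probabilities of getting 0..needed-1 usable octets.
    return sum(
        comb(octets, k) * accept**k * REJECT ** (octets - k)
        for k in range(needed)
    )


# 32 octets corresponds to a SHA-256 digest, 64 octets to SHA-512.
for size in (32, 64):
    print(size, "octets:", float(prob_run_out(size)))
```

Under those assumptions a 32-octet digest already makes failure extremely unlikely, and a 64-octet digest pushes it down much further.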
paulej wrote:I also have the same issue with aescrypt_keygen. I have a fix for that in the source repository, but not posted to the web site. For that, I just used an additional two characters: I then selected "%" and "$" to create 64 possible values. Two reasons it is not yet posted are that 1) I'm not sure I like this solution, and 2) I'd like to just integrate the code right into aescrypt itself.
I had previously updated the pwgen C code to avoid modulo bias. Basically, it just discards any value that is higher than acceptable to avoid bias.
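The discard approach is usually called rejection sampling. The following is a minimal Python sketch of the idea, not the actual pwgen C code: `random_password`, `ALPHABET` and `LIMIT` are illustrative names, and the 62-character alphabet is an assumption.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_letters  # 62 characters (assumed)
LIMIT = 256 - (256 % len(ALPHABET))               # 248: largest multiple of 62 <= 256


def random_password(length: int = 16) -> str:
    """Build a password by discarding octets that would cause modulo bias."""
    chars = []
    while len(chars) < length:
        # Fetch a batch of random octets; some may be discarded below.
        for octet in secrets.token_bytes(length):
            if octet < LIMIT and len(chars) < length:
                chars.append(ALPHABET[octet % len(ALPHABET)])
    return "".join(chars)


print(random_password())
```

Because only octets 0..247 are kept and 248 = 4 * 62, every character ends up with exactly four octet values mapping to it.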
The Perl code is not subject to bias due to the random number generator that is used. It generates a big integer (not just values 0..255) and, based on my tests, bias is not visible.
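A quick way to see why a big integer hides the bias: with more input bits, the gap between the most likely and least likely character shrinks. This Python sketch assumes the generator returns a uniform integer of some power-of-two width; the real width used by the Perl code is not stated here, so 32 and 64 bits are only example values, and `bias_ratio` is a hypothetical helper.

```python
from fractions import Fraction


def bias_ratio(bits: int, modulus: int = 62) -> Fraction:
    """Ratio of the most likely to the least likely result of value % modulus."""
    total = 1 << bits                 # number of equally likely input values
    low = total // modulus            # hits for the least likely residues
    high = low + (1 if total % modulus else 0)  # hits for the most likely
    return Fraction(high, low)


for bits in (8, 32, 64):
    print(bits, "bits:", float(bias_ratio(bits)))
```

With a single octet the ratio is 5/4, while for a 32-bit value it is within about one part in tens of millions of 1, which fits with bias not being visible in tests.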
If it's not a problem in pwgen I'm surprised to hear that there's a problem in aescrypt_keygen.
If you're thinking of adding 2 additional characters to round up to 64, what about just using standard base-64 encoding? (But there are a few different "standards".)
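For comparison, here is a small Python sketch of the base-64 route using the standard library. The input string is made up, and taking the first 12 octets of a SHA-256 digest is just one possible choice. The two common alphabets differ only in the last two characters: "+" and "/" in the standard one, "-" and "_" in the URL-safe one.

```python
import base64
import hashlib

# Hypothetical input; any byte string works.
digest = hashlib.sha256(b"example input").digest()

# 12 octets = 96 bits = exactly 16 base-64 characters, so no "=" padding.
standard = base64.b64encode(digest[:12]).decode("ascii")
urlsafe = base64.urlsafe_b64encode(digest[:12]).decode("ascii")

print(standard)
print(urlsafe)
```

Each base-64 character encodes exactly 6 bits, so there is no modulo step and no bias to worry about.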
These classical problems of modulo bias and whether or not to use base-64 are why I frequently just stick with using hex characters: no ambiguous characters (0 vs O, etc.) in case I'm forced to enter it manually, and it just looks geeky cool. lol. I do essentially what you're doing with SinglePass: do a hash on something and use all or some of the hex characters. Working from the command line makes it quite easy in whatever OS I'm using at the time (home=Linux, work=Windoze).
Sure, it's only 4 bits of entropy per character at most, but since I've gotten myself to a point that some sort of password safe (Keepass or LastPass) is absolutely mandatory for virtually everything I do, entering 24 hex characters through copy-and-paste is just as easy as entering 16 base-64 characters. Same reason for not choosing to use special characters, which I also see you're not a real fan of: it's just not all that necessary.
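A minimal Python sketch of the hash-to-hex approach, assuming SHA-256 and a 24-character cut. `hex_password` is a hypothetical helper, and the input string is made up. The arithmetic behind the comparison is 24 * 4 = 16 * 6 = 96 bits.

```python
import hashlib


def hex_password(text: str, length: int = 24) -> str:
    """Hash the input with SHA-256 and keep the first `length` hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


password = hex_password("some secret phrase")  # hypothetical input
print(password)

# 24 hex characters carry the same number of bits as 16 base-64 characters.
print(24 * 4, 16 * 6)
```

Each hex character is exactly 4 bits of the digest, so there is no modulo bias here either.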
Wow, sorry if I've gotten too wordy. lol