HMAC verification allowing faster decipher iterations
Posted: Wed Aug 01, 2012 6:11 pm
One thing I noticed going through the source code. AESCrypt has the ability to quickly verify whether or not the provided password is correct. Now, while being a nice feature, it also presents a significant problem of making the encrypted file far more vulnerable to brute force attack.
Here's why, if the password was NOT hashed with an HMAC inside the encrypted file, decryption software would have to guess the 32byte key, and decrypt the entire file, or a portion of the file, and then "magically determine" the file type to see if it indeed succeed at cracking the key. The software would have to look for dictionary words, etc, to see if it indeed had cracked the file. Given the numerous types of files encrypted with AES, this would take an extraordinary amount of time to do, because, there are potentially thousands of possible file types being encrypted and the number of cycles necessary to determine if the correct password (key via the hash/8192 iterations) has been found would be incredibly high because there would be no way for the decryption software to know whether or not it had guessed the correct password without performing some intelligent analysis on the output.
By storing the password in the file, encrypted/hashed, with an HMAC, yes, you have a fast way of checking whether or not the password is valid, BUT, the MASSIVE DOWNSIDE to this convenience is that it also makes it several thousand times more convenient for the decryption software to guess passwords.
Since the decryption software simply needs to perform the exact same password check that AESCrypt does during a dictionary/brute force/etc attack, there is no need to even look at the ciphertext at all. The decryption software will know immediately within a couple cycles whether or not it has guessed the correct password.
Personally, I would rather enter an incorrect password, and decrypt my aes file into a pile of jibberish, than get an immediate response saying "Bad Password".
Essentially, what you are doing here is enabling a "Quick Validation" of the password for decryption software.
Without this "feature" the decryption software would have to validate the jibberish produced for each possible key tried to validate whether or not the correct password was guessed. The evaluation of this "jibberish" would take an estimated millions of even billions of extra cycles to analyze the output from the guessed decryption key.
I can see the need for the convenience for knowing whether your password is correct or not for website logins, etc, and basic public and non-secure data. But, for personal cryptography files where security is paramount, giving decryption software a major leg up by letting it know right away whether a password is good or bad is a bad idea IMHO.
Personally, I use the Rijndael 256bit 128block to handle encryption/decryption in all of my software for persistent data.
Since my algorithms do not give any sort of easy spot check password/key validation, any decryption software would not only have to guess my key, but, it would then have to figure out what the heck it was looking at after decryption. In fact, most of the data which I encrypt is in essence a result of a binary serialization using a custom serializer, so, even if the attacker was successful at decrypting one of my "Binary Blobs", it would have no clue that it had actually guessed the correct key, because even after decryption, the file still looks like random jibberish.
I truly believe that due to the number of fileformats being encrypted using this software, removing the password check would make it thousands, if not millions of times more secure since the decrypt process would have to somehow identify the resultant decrypted file as a "known file type", or as "something not encrypted anymore".
Being that performing a simple hash/hmac check requires only a few cycles compared to intelligent analysis of decrypted content, you can see my argument here.
It would be extremely easy for me to right a brute force/dictionary attack against the .aes file format since all I would have to do is validate the password, not validate the resultant decrypted file(s). That attack would be very simple and be able to run trillions of iterations compared to software which not only had to guess the password, but, decrypt part of the cipher text and make an informative decision as to whether or not the decrypted ciphertext was in fact really decrypted.
I know this to be very true because just in my personal experience with my own software, since I don't have a quick and easy password check built into my ciphertext, there have been many times, where I decrypt something with the incorrect password/salt/IV and then the decrypted cipher text simpy doesn't work at all, When I look at the decrypted ciphertext in a hex editor, there is no way for me to tell if it is the original plaintext or not because it's just a binary blob of jibberish. So, with my own AES implementation, a hacker would have an impossible time decrypting one of my files since the binary jibberish before and after the decryption process would look the same.
Also, even if a hacker wrote software to specifically locate certain identifiers within by decrypted serialized blob so that it could "know" whether or not it was successful in decrypting my ciphertext, that would only work on that particular filetype. This, given that I use my AES 256 implementation on a wide variety of filetypes, even with a somewhat simple password, it would be nearly impossible for a hacker to decrypt my information, because of the numerous types of files encrypted.
Personally, if I were in your shoes, instead of offering up a simple way to verify the key to the decryption software, I would use a begin-end obfuscation technique which would be applied to the ciphertext FULLY before encryption/after FULL decryption and would require the entire file to be decrypted in order for the obfuscation software to serialize/deserialize the input/output. This way, after each key is tried, the entire ciphertext would have to be decrypted, then and only then could the obfuscation software be run to restore the original ciphertext.
That way, decryption software would not only have to guess the key,to validate it was working, it would have to completely decrypt the file, then run the obfuscation algorithm to be able to validate whether or not the key was correct. A simple streamed obfuscation software would be enough to make this happen if it required bytes from throughout the file to reproduce the original output. Just an XOR + some magic would be necessary, not much at all.
This way, instead of taking a billionth of a second to validate a key, it would take a second or two to validate the key. So, that would make decryption software work a billion times harder, or even more, to guess a single key.
So, with descent obfuscation on the input (inserting bytes all throughout the file, perhaps randomly, for de-obfuscation) and removal of the super easy password validations, I think it would be absolutely impossible to write a decryption mechanism for a file encrypted in such a way. If you think about it, writing some simple obfuscation mechanism wouldn't be that hard at all. The best part of all, is that by adding a simple obfuscation mechanism, the hacker/cracker would literally need to decrypt the entire file to even begin checking if the key was valid. This would take an eternity per guess compared to the super easy password check mechanism currently built into the software.
I love the software, but, I just think the password check convenience would make any attempts at decryption actually viable in the eyes of the cryptoanalyst. Personally, if someone gave me the format I described above, and told me to write a brute force attack against, I would take a pass. Any government agency, unless completely ordered to, would take a pass at something so futile. But, with the way it is now, I could write a brute force attack in minutes. Cracking the key is made so much easier by the ability to instantly know whether or not I've got the correct key.
Crypto software isn't just about keylength and complexity. Once could have a very simple key, yet with a full file obfuscation technique, it could potentially take an attacker years to guess it because of all the work necessary to check each key.
My personal contribution to cryptography was when I wrote something called WepBrute during the times when AirSnort and Kismet were popular. My approach was much different. Instead of looking for "weak 4's" in a lot of packets, I simply wrote an extremely intelligent dictionary attacker which used combinations of names, dictionary words, words with capitalized first and last letters, Number-Letter substitution, i.e. I looked for 3 instead of E or e in a word, I also would do all of the above with a single non-alpha character at the beginning of the end of a password. So many people do that it's almost comical.
Anyway, using AisSnort I tried cracking several networks with no success because they did not send enough traffic to capture and take advantage of the "weak 4's",
After writing WebBrute, I was able to crack about 80% of these exact same networks in literally seconds. All I needed was a single packet, and I was off and running. It was amazing software. Literally, I would go in for security audits and start reading off the WEP keys within seconds 80% of the time.
Granted, that was also because of the structure of the WEP key system, and how easy it was for a quick password verification against the cipher text. Since AES is not structured in that manner at all, there is no quick way to verify passcodes, except in your implementation.
The software works great, etc, but, I just think you've made it way too easy for crypto people to crack the passcodes, it's like giving them a slice of pie before they even start.
I say get rid of the password validation check and let the crackers suffer and decrypt the entire file with each key and then check the jibberish to see if it recreates a file, then after that, they have to identify if it's a valid file type! Basically, it would be impossible to crack. It would simple take too long. Your talking about a zillion lines of code, not to mention the massive amount of File IO necessary to fully decrypt everytime to check a key. What if it was a RAR file that was a few gigs, it would take forever to check even a single key.
You should at least make it an option for the user. Build it directly into AESCrypt, it would be very very simple to do, you would simply take the password, iterate and hash it 8192 times, then just encrypt with the IV and salt. On the flip side, you would do the same, Then if you wanted to really create havoc for any cryptographers trying to crack your AES files, you would simply put in a randomized byte obfuscator so that the file would look to be complete jibberish because of the random encoding blocks stuck throughout the file, from beginning to end. This way the cryptographer couldn't simply divide up the bytes and run some kind of sequence check at the first, last, 10th, 20th, byte, etc. Rather, make the obfuscation randomized whereas the byte sequence would simple identify the XOR'ish hash and the location of the next byte sequence. And since it would be random, the sequence could potentially exist on the 999th out of 1500 blocks and on the 1500th block, or it could be on the 2nd, 55th, and 99th out of a 100 blocks. The bottom line is that it would force the cracker to take a pass on the decryption. Personally, I wouldn't even attempt to crack something like that. Well, unless it was a certain supermodel's <blank> ... Then, maybe.
Thanks for great software,
Damian Kohlfeld
Here's why, if the password was NOT hashed with an HMAC inside the encrypted file, decryption software would have to guess the 32byte key, and decrypt the entire file, or a portion of the file, and then "magically determine" the file type to see if it indeed succeed at cracking the key. The software would have to look for dictionary words, etc, to see if it indeed had cracked the file. Given the numerous types of files encrypted with AES, this would take an extraordinary amount of time to do, because, there are potentially thousands of possible file types being encrypted and the number of cycles necessary to determine if the correct password (key via the hash/8192 iterations) has been found would be incredibly high because there would be no way for the decryption software to know whether or not it had guessed the correct password without performing some intelligent analysis on the output.
By storing the password in the file, encrypted/hashed, with an HMAC, yes, you have a fast way of checking whether or not the password is valid, BUT, the MASSIVE DOWNSIDE to this convenience is that it also makes it several thousand times more convenient for the decryption software to guess passwords.
Since the decryption software simply needs to perform the exact same password check that AESCrypt does during a dictionary/brute force/etc attack, there is no need to even look at the ciphertext at all. The decryption software will know immediately within a couple cycles whether or not it has guessed the correct password.
Personally, I would rather enter an incorrect password, and decrypt my aes file into a pile of jibberish, than get an immediate response saying "Bad Password".
Essentially, what you are doing here is enabling a "Quick Validation" of the password for decryption software.
Without this "feature" the decryption software would have to validate the jibberish produced for each possible key tried to validate whether or not the correct password was guessed. The evaluation of this "jibberish" would take an estimated millions of even billions of extra cycles to analyze the output from the guessed decryption key.
I can see the need for the convenience for knowing whether your password is correct or not for website logins, etc, and basic public and non-secure data. But, for personal cryptography files where security is paramount, giving decryption software a major leg up by letting it know right away whether a password is good or bad is a bad idea IMHO.
Personally, I use the Rijndael 256bit 128block to handle encryption/decryption in all of my software for persistent data.
Since my algorithms do not give any sort of easy spot check password/key validation, any decryption software would not only have to guess my key, but, it would then have to figure out what the heck it was looking at after decryption. In fact, most of the data which I encrypt is in essence a result of a binary serialization using a custom serializer, so, even if the attacker was successful at decrypting one of my "Binary Blobs", it would have no clue that it had actually guessed the correct key, because even after decryption, the file still looks like random jibberish.
I truly believe that due to the number of fileformats being encrypted using this software, removing the password check would make it thousands, if not millions of times more secure since the decrypt process would have to somehow identify the resultant decrypted file as a "known file type", or as "something not encrypted anymore".
Being that performing a simple hash/hmac check requires only a few cycles compared to intelligent analysis of decrypted content, you can see my argument here.
It would be extremely easy for me to right a brute force/dictionary attack against the .aes file format since all I would have to do is validate the password, not validate the resultant decrypted file(s). That attack would be very simple and be able to run trillions of iterations compared to software which not only had to guess the password, but, decrypt part of the cipher text and make an informative decision as to whether or not the decrypted ciphertext was in fact really decrypted.
I know this to be very true because just in my personal experience with my own software, since I don't have a quick and easy password check built into my ciphertext, there have been many times, where I decrypt something with the incorrect password/salt/IV and then the decrypted cipher text simpy doesn't work at all, When I look at the decrypted ciphertext in a hex editor, there is no way for me to tell if it is the original plaintext or not because it's just a binary blob of jibberish. So, with my own AES implementation, a hacker would have an impossible time decrypting one of my files since the binary jibberish before and after the decryption process would look the same.
Also, even if a hacker wrote software to specifically locate certain identifiers within by decrypted serialized blob so that it could "know" whether or not it was successful in decrypting my ciphertext, that would only work on that particular filetype. This, given that I use my AES 256 implementation on a wide variety of filetypes, even with a somewhat simple password, it would be nearly impossible for a hacker to decrypt my information, because of the numerous types of files encrypted.
Personally, if I were in your shoes, instead of offering up a simple way to verify the key to the decryption software, I would use a begin-end obfuscation technique which would be applied to the ciphertext FULLY before encryption/after FULL decryption and would require the entire file to be decrypted in order for the obfuscation software to serialize/deserialize the input/output. This way, after each key is tried, the entire ciphertext would have to be decrypted, then and only then could the obfuscation software be run to restore the original ciphertext.
That way, decryption software would not only have to guess the key,to validate it was working, it would have to completely decrypt the file, then run the obfuscation algorithm to be able to validate whether or not the key was correct. A simple streamed obfuscation software would be enough to make this happen if it required bytes from throughout the file to reproduce the original output. Just an XOR + some magic would be necessary, not much at all.
This way, instead of taking a billionth of a second to validate a key, it would take a second or two to validate the key. So, that would make decryption software work a billion times harder, or even more, to guess a single key.
So, with descent obfuscation on the input (inserting bytes all throughout the file, perhaps randomly, for de-obfuscation) and removal of the super easy password validations, I think it would be absolutely impossible to write a decryption mechanism for a file encrypted in such a way. If you think about it, writing some simple obfuscation mechanism wouldn't be that hard at all. The best part of all, is that by adding a simple obfuscation mechanism, the hacker/cracker would literally need to decrypt the entire file to even begin checking if the key was valid. This would take an eternity per guess compared to the super easy password check mechanism currently built into the software.
I love the software, but, I just think the password check convenience would make any attempts at decryption actually viable in the eyes of the cryptoanalyst. Personally, if someone gave me the format I described above, and told me to write a brute force attack against, I would take a pass. Any government agency, unless completely ordered to, would take a pass at something so futile. But, with the way it is now, I could write a brute force attack in minutes. Cracking the key is made so much easier by the ability to instantly know whether or not I've got the correct key.
Crypto software isn't just about keylength and complexity. Once could have a very simple key, yet with a full file obfuscation technique, it could potentially take an attacker years to guess it because of all the work necessary to check each key.
My personal contribution to cryptography was when I wrote something called WepBrute during the times when AirSnort and Kismet were popular. My approach was much different. Instead of looking for "weak 4's" in a lot of packets, I simply wrote an extremely intelligent dictionary attacker which used combinations of names, dictionary words, words with capitalized first and last letters, Number-Letter substitution, i.e. I looked for 3 instead of E or e in a word, I also would do all of the above with a single non-alpha character at the beginning of the end of a password. So many people do that it's almost comical.
Anyway, using AisSnort I tried cracking several networks with no success because they did not send enough traffic to capture and take advantage of the "weak 4's",
After writing WebBrute, I was able to crack about 80% of these exact same networks in literally seconds. All I needed was a single packet, and I was off and running. It was amazing software. Literally, I would go in for security audits and start reading off the WEP keys within seconds 80% of the time.
Granted, that was also because of the structure of the WEP key system, and how easy it was for a quick password verification against the cipher text. Since AES is not structured in that manner at all, there is no quick way to verify passcodes, except in your implementation.
The software works great, etc, but, I just think you've made it way too easy for crypto people to crack the passcodes, it's like giving them a slice of pie before they even start.
I say get rid of the password validation check and let the crackers suffer and decrypt the entire file with each key and then check the jibberish to see if it recreates a file, then after that, they have to identify if it's a valid file type! Basically, it would be impossible to crack. It would simple take too long. Your talking about a zillion lines of code, not to mention the massive amount of File IO necessary to fully decrypt everytime to check a key. What if it was a RAR file that was a few gigs, it would take forever to check even a single key.
You should at least make it an option for the user. Build it directly into AESCrypt, it would be very very simple to do, you would simply take the password, iterate and hash it 8192 times, then just encrypt with the IV and salt. On the flip side, you would do the same, Then if you wanted to really create havoc for any cryptographers trying to crack your AES files, you would simply put in a randomized byte obfuscator so that the file would look to be complete jibberish because of the random encoding blocks stuck throughout the file, from beginning to end. This way the cryptographer couldn't simply divide up the bytes and run some kind of sequence check at the first, last, 10th, 20th, byte, etc. Rather, make the obfuscation randomized whereas the byte sequence would simple identify the XOR'ish hash and the location of the next byte sequence. And since it would be random, the sequence could potentially exist on the 999th out of 1500 blocks and on the 1500th block, or it could be on the 2nd, 55th, and 99th out of a 100 blocks. The bottom line is that it would force the cracker to take a pass on the decryption. Personally, I wouldn't even attempt to crack something like that. Well, unless it was a certain supermodel's <blank> ... Then, maybe.
Thanks for great software,
Damian Kohlfeld