Does an encrypted file show the same number of bytes?

Discussion related to AES Crypt, the file encryption software for Windows, Linux, Mac, and Java.
Petraco
Posts: 3
Joined: Tue Nov 07, 2017 4:06 pm

Does an encrypted file show the same number of bytes?

Post by Petraco »

Every file has an almost unique number of bytes. If I encrypt a file with AES Crypt, is the number of bytes in the original file detectable from the encrypted file?

My worry is this: an eavesdropper may not know the content of my encrypted file, but if s/he already has a copy of the file, s/he may be able to determine whether my encrypted file is or is not that file.

Anybody got any answers?
paulej
Posts: 593
Joined: Sun Aug 23, 2009 7:32 pm
Location: Research Triangle Park, NC, USA

Re: Does an encrypted file show the same number of bytes?

Post by paulej »

That's an interesting question, not because I don't know the answer, but because I'm quite interested in knowing why you ask.

Don't need to tell me. I can imagine a few scenarios. For example, a whistle-blower who wants to sneak out documents without those documents being potentially matched based on file size.

Presently, it is possible to take an AES Crypt file and determine the precise size of the original file. Of course, one cannot determine the contents.

I have a laundry list of feature requests, and I could definitely see adding this to the list (if you are interested). However, to do this I would need to create a new version of the file format presently defined. There is no way to sneak random octets into the current information flow.

If I were you and wanted an immediate solution to the problem, I would take the original file and put it in a .zip file. Then I would add random stuff to the zip file. And by "random", I mean files with random data in them to make the ZIP file less compressible. (If you were to add files containing nothing but zeros, for example, that would compress to an amazingly small file, which is not what you want.) I can definitely create some files with random data in them for you, if you're interested.

Once you have the ZIP file in hand, encrypt it. Rename it, too, so nobody would know it's a ZIP file.

In that way, you can vary the size of the file. The recipient would need to know it's a ZIP file so they can open it. Inside, there would be the file you intended to send along with several nonsense files (all of which could be clearly named so as to avoid confusion).
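The ZIP approach described above could be scripted roughly as follows. This is only a sketch in Python, not part of AES Crypt: the function name `build_padded_zip`, the decoy file names, and the size range are all hypothetical choices made for illustration.

```python
import os
import secrets
import zipfile


def build_padded_zip(target_path, zip_path, decoy_count=3,
                     min_decoy=1_000, max_decoy=50_000):
    """Write target_path plus randomly sized decoy files into zip_path.

    Returns the size of the resulting ZIP file in bytes.
    All names and default sizes here are illustrative assumptions.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The real file, stored under its own base name.
        zf.write(target_path, arcname=os.path.basename(target_path))
        for i in range(decoy_count):
            # Random length in [min_decoy, max_decoy], filled with random octets
            # so the decoy does not compress away.
            size = min_decoy + secrets.randbelow(max_decoy - min_decoy + 1)
            # A clearly labelled name so the recipient knows to ignore it.
            zf.writestr(f"IGNORE_padding_{i}.bin", os.urandom(size))
    return os.path.getsize(zip_path)
```

The resulting archive would then be encrypted with AES Crypt as usual, and the recipient would unzip it after decrypting and discard the padding files.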

If I were to do this programmatically, what I would do is add a switch to aescrypt like this:

aescrypt -e -p foo -x 25835 file.doc
The -x here could mean to pad the input stream with 25835 extra octets that get discarded when decrypting.
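
To make the idea of padding that gets discarded on decryption concrete, here is a minimal Python sketch. It is purely hypothetical: it does not reflect the AES Crypt file format or any existing switch, and the 8-byte length header and the names `pad_stream` and `unpad_stream` are assumptions made for illustration. The padding would be applied before encryption, so the length header would be encrypted along with everything else.

```python
import os
import struct


def pad_stream(plaintext: bytes, extra: int) -> bytes:
    """Prefix the true length, then append `extra` random octets.

    Hypothetical framing, not the AES Crypt format.
    """
    header = struct.pack(">Q", len(plaintext))  # 8-byte big-endian length
    return header + plaintext + os.urandom(extra)


def unpad_stream(padded: bytes) -> bytes:
    """Recover the original octets and discard the padding."""
    (length,) = struct.unpack(">Q", padded[:8])
    return padded[8:8 + length]
```

With this framing, the encrypted output would grow by the chosen amount, while decryption would return exactly the original octets.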

Like I said, it's an interesting feature. It might even have more utility than what I have in mind here. Alas, it's not there right now and it would have to be in a major upgrade to AES Crypt.

So for now, I'd suggest the ZIP file approach, stuffing random files into a ZIP file to mask the real file size. (You can even put a ZIP file inside a ZIP file.)
Petraco
Posts: 3
Joined: Tue Nov 07, 2017 4:06 pm

Re: Does an encrypted file show the same number of bytes?

Post by Petraco »

Thank you for your comprehensive answer to my question. It is extremely helpful.

Eavesdroppers who have access to the Internet giants’ email and cloud systems could have many purposes for wanting to know who was sending a known file to whom. And as you say, they could do that even though the file itself was encrypted by AES Crypt.

I can understand the solution of making a zip file and encrypting that. In the zip would be the secret file and a random one. Obviously, the sender would change the title of the encrypted file.

What I don’t understand is the purpose of hiding the fact that the encrypted zip file is a zip file. The email attachment is already suspicious as it is encrypted; an encrypted zip file is no more or less suspicious.

AES Crypt is, I guess, mostly used for email attachments and for archiving material which is often then stored in the cloud. Perhaps the AES Crypt Website could draw attention to the byte-number loophole. This comes into play when users encrypt single files for emailing or putting in the cloud, files which are not their own, where governments or corporations might have an interest in knowing who possesses and circulates them.
paulej
Posts: 593
Joined: Sun Aug 23, 2009 7:32 pm
Location: Research Triangle Park, NC, USA

Re: Does an encrypted file show the same number of bytes?

Post by paulej »

While I agree that the file length information could be used to make assumptions about what files a person is transmitting if the observer already has the plaintext file, it's not a design flaw nor a loophole. The point of AES Crypt is to encrypt, not to try to hide the fact that a whistle-blower, for example, is trying to leak documents.

As I said, I do think it's an interesting idea, but that's not a security issue.
"What I don’t understand is the purpose of hiding the fact that the encrypted zip file is a zip file."

That's not essential, but it might further confuse people who are trying to identify leaks, for example, since they won't be able to find a .docx file, for instance, that has the same file length. If it's a ZIP, they'd know that fact. But it doesn't give up confidentiality of the contents.
Petraco
Posts: 3
Joined: Tue Nov 07, 2017 4:06 pm

Re: Does an encrypted file show the same number of bytes?

Post by Petraco »

Yes, this issue has nothing to do with the security of AES Crypt in encrypting the contents of files, but is about file identification after encryption. On the basis of what you have explained above, are the following steps valid?

Purpose: to securely conceal the number of bytes in an encrypted file.

1. Take the target file and another random file and place them together into a zip file.
2. Encrypt the zip file using AES Crypt.
3. Send the encrypted file as an email attachment or upload it into cloud storage in the knowledge an eavesdropper cannot identify the file from its byte count.
paulej
Posts: 593
Joined: Sun Aug 23, 2009 7:32 pm
Location: Research Triangle Park, NC, USA

Re: Does an encrypted file show the same number of bytes?

Post by paulej »

Yeah, that would prevent anyone who has the plaintext file from matching it with the length of the encrypted file. Just make sure those other files are truly random in size. If you used the same additional file in every instance and there were many files to inspect, a person with all this data might be able to determine that some fixed-size additional file was used. Also, make sure that additional file isn't all zeros or something similar, since highly compressible data shrinks to almost nothing in the ZIP and effectively leads to a fixed-length addition. I would use varying-sized files with random octets inside, as those aren't compressible. That, or configure your zip tool to store, not compress, data.
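
The point about compressibility can be checked with a short Python sketch. It uses `zlib`, which implements the same DEFLATE compression that ZIP tools commonly apply; the 100,000-octet size is an arbitrary choice for illustration.

```python
import os
import zlib

size = 100_000
zeros = bytes(size)               # 100,000 zero octets
random_octets = os.urandom(size)  # 100,000 random octets

compressed_zeros = zlib.compress(zeros)
compressed_random = zlib.compress(random_octets)

# The zeros shrink to a tiny fraction of their size, so they add almost
# nothing to the archive; the random octets stay roughly the same size.
print("zeros: ", size, "->", len(compressed_zeros))
print("random:", size, "->", len(compressed_random))
```

If the archive is built with Python's `zipfile` module, passing `compression=zipfile.ZIP_STORED` is the store-only mode, in which case the content of the padding no longer affects the resulting size.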