Behind the WinZip KDF optimization
#1
With the latest beta-version of hashcat I've added support to crack WinZip encrypted (password protected) archives. 

While I was implementing the kernel it turned out that their use of PBKDF2 (aka PKCS #5 v2.0) is suboptimal in some way and that an attacker can exploiting this to optimize the KDF generation, which basically means higher crack performance. 

I'm not aware of any other cracker making use of this optimization, but I might be wrong. 

It's not really rocket sciene. All you need to understand how PBKDF2 works internally and how it's used within the WinZip KDF. Therefore I don't think it's a problem to expain it. Funny side fact is that 1Password made the same mistake a few years ago, too. 

So here it is:

Depending on the encryption mode the user selects, WinZip uses PBKDF2-HMAC-SHA1 to derive a 256, 384 or 512 bit key. That alone should ring a bell, because what you're selecting is AES cipher with 128, 192 or 256 bits. The extra bits could be explained with an optional feature of the algorithm, that is when it adds 16 "checksum" bits additionally, but we don't make use of it here (even if we could). You'll see later what's behind this oversized key selection.

Back to PBKDF2. One of PBKDF2 features is that you can select a specific key output length, for whatever you need it for, regardless of the real output size of the selected hash. For example, even if SHA1 creates 160 bit only, PBKDF2 can create 512, or more. So the question is how it does that. The answer is very simple. It mixes in a fixed but increasing counter value for each chunk. For example for the first 160 bits, it adds a 0x01 to the data part used in HMAC, and adds a 0x02 for bits 160-320, 0x03 for bits 320-480, and so forth. Resulting in completely different chunks of size 160 bits. Finally it appends all the chunks together and truncates to the selected user output length. Now that's a cool thing, it could be called a feature of PBKDF2 (and actually I think it does so), because it enables the parallel computation of a PBKDF2 key, even for the defender.

Now let's take a look at the WinZip use of PBKDF2 output key:

Quote:key = substr (key, key_len, key_len)

Now we can see why they let us calculate the key length with the double of the size of the cipher. Because what they wanted to achieve here is that the key changes with the selected encryption mode, even with the same user defined password, simply by choose a different key offset. And that's the problem. 

In numbers:

AES256:
  • total keylength: 528
  • total chunks: 4
  • key offset: 256
  • key chunk: 2
AES192:
  • total keylength: 400
  • total chunks: 3
  • key offset: 192
  • key chunk: 2
AES128:
  • total keylength: 272
  • total chunks: 2
  • key offset: 128
  • key chunk: 1
We have all the information to individually calculate those chunks from each other. In other words, there's no need to start the KDF computation from 0. We can start the KDF computation from the selected key chunk, not from 1. That means we can improve the calculation of an AES256 key by 25%, AES192 by 33% and suprisingly AES128 0%.

Have fun!
#2
JtR had that optimization between May 2013 and July 2014 but it was accidentally removed by JimF when he did other improvements. And it was also never implemented for GPU. That is fixed now.

I believe you still miss an opportunity of early-reject that makes for even more boost. JtR always does a single chunk of PBKDF2 regardless of key size. So compared to the naive implementation we can actually boost AES256 key by 300%, AES192 by 200% and we do get a boost for AES128 too, of 100%.

Putting it another way, we can do same speed regardless of type.