Possibility of zip archive password cracking?
I just had someone privately request RAR support today. Apparently there is a lot of interest in such a thing by people who find bread crumb trails from hackers (apparently encrypted RARs are a favorite way to exfiltrate data).
Tarcel, you have 4 or more files in your archive and it was generated with WinZIP version prior to 8? In that case, the entropy of the IVs is much lower due to a bug in winzip implementation and such an archive can be broken easily in 2-3 hours on a CPU (BTW that attack is not applicable on GPUs) even if a complex password was used. ElcomSoft's software does that I think.

Other than that, there is no GPU-enabled cracker that can crack classic (ZIP 2.0) encryption on AMD GPUs yet . Mine would be the first to do that Smile

Also, cracking speed with classic encryption depends a LOT on the number of files in archive. Optimal speeds are reached when there are at least 3 files. The reason for that is that the password verifier is just one byte long and you have 1/256 chance of a false positive - in that case you need to decrypt the whole file, decompress it, calculate CRC and compare it to the header CRC. There are some early heuristic checks that can be done that gets the possibility for a false positive from 1/256 to say 1/1024 or more, but that's still not enough. If you have say 3 files in archive, then you have 3 verifier values available that get the possibility for false positive down to 1/16 million. With 4 files it gets down to 1/4 billion.

To summarize: if your zip file has just one big file in the archive, password recovery proceeds at very low speed. If you have 3 or more files, you can utilize the GPU at 100% to recover the password - my program achieves speeds of 630 million c/s on 5870 at stock clocks. Paradox is that NVidia is faster than AMD in that particular algo and a GTX580 can get you about 700-800M/s.

ZIP 3.0 (strong AES encryption) is a completely different beast, it is rather ALU-bound (PBKDF2, 1000 iterations) and it does not matter how much files you have in the archive. Smaller files typically lead to faster cracking though.

As far as RAR is concerned, things get very ugly. There are two options here - using header encryption (-hp mode) and not using header encryption (-p mode).

-hp mode is very easy and does not depend on the number or size of files in archive. All you need is do the SetCryptKeys transformation (apply ~262000 times password+salt then do sha1 on it) then get the encryption key and IV, decrypt the header and compare part of it to a well-known value.

-p mode is very nasty because there is no "early check" - you basically need to replicate the complete unrar functionality, decrypt the file, decompress it, calculate CRC, compare. The RAR compression algorithm is proprietary and badly documented. I am currently working on that and at that point I successfully crack both types on CPU (GPU support not ready yet). Unfortunately there is a big problem with -p mode and large files in archive - speed drops a lot and it all becomes a disaster. I am now working on some early heuristic checks but still it's not adequate and I am kinda desperate. It works like charm with archives in -p mode with files less than 1-2MB in size though.

Also, RAR password hashing is a _BRUTAL_ one, much much worse than WPA or MSCASH2. My speed projections on 5870 are about 14000-15000 c/s although it depends on password length and this is for length=8. As a comparison, WPA on 5870 does about 100000 c/s, almost 8 times faster.

Overall, I think cracking archives is an interesting thing to do. I can also help if you decide to implement it in oclhashcat.
Wow, that sounds depressing for RAR cracking. Perhaps there is a design weakness that could lead to a shortcut? I wish I knew something about cryptanalysis.
(02-04-2012, 04:04 AM)chort Wrote: Wow, that sounds depressing for RAR cracking. Perhaps there is a design weakness that could lead to a shortcut? I wish I knew something about cryptanalysis.

There are none. Practically speaking, only dictionary attacks and rule-based attacks with limited keyspace are applicable, bruteforce or markov are pure madness. Like I said, RAR without header encryption is even worse. I mentioned looking for heuristics to do an early check, but this failed because the RAR compression algorithm is awful. You can hope to cut off some false positives after at least 32 kilobytes of input data have been AES decrypted and decompressed. Frankly said, this is very bad for password recovery. The only hope would be some fast way to AES decrypt large (32K) blocks of data, but GPUs are not the right hardware for that. Perhaps using AES-NI instructions would make some difference, but my CPU does not support them. Thus, I have some archives that I crack at the "amazing" speed of ~ 100 c/s. Of course, the GPU would calculate the AES keys (based on password and sha1 transformations) much faster, even though it's hundreds thousands SHA1 operations, however password verification is so slow on host (cpu-side) just because you need to read at least 32KB of data, decrypt and decompress it. It just sucks.

Anyway, it could be worse. 7Zip password protection for example is real badass, it's twice as slow as RAR. Still haven't researched the 7z format in more details, but if it turns out they do the same as RAR does with -p mode archives, coding a 7zip cracker becomes a waste of time.

(02-04-2012, 11:33 PM)gat3way_ Wrote: Practically speaking, only dictionary attacks and rule-based attacks with limited keyspace are applicable, bruteforce or markov are pure madness.
Thus, I have some archives that I crack at the "amazing" speed of ~ 100 c/s.
This is one of the reasons I've been suggesting that something in the hashcat family gets support for multiple dictionaries, like those archive cracking programs have.
What do you mean by "multiple dictionaries?" oclHashcat-plus can use multiple dictionaries if you give it a directory as the dictionary parameter. It can also use two dictionaries in combination by specifying the combinator attack.
Words from one dictionary added to words from another dictionary to make a string to be tested, with mangling rules for each dictionary.

Suppose one dictionary is bugs.txt and another is animals.txt.
I'd want apply a rule to bugs.txt to get ant00, ant01, ant02,... and another rule to animals.txt to get dog@, dog#, dog%,..., to be combined as ant00dog@, ant00dog@,... ant02dog%,....

The work would be done by the program, in a way similar to what some archive/document crackers, not by first generating a text file of the combinations and using that as the dictionary.
oclHashcat-plus can do this, but not with rules (currently).
To add to what chort said, it's also possible to do it with Oclhashcat-plus if you use dictionary + rules. You have bugs as a dictionary, use a first rule file to add the 00, 01, 02, etc, a second rule file containing the "rulified" animals.txt like $d$o$g, $c$a$t, $c$h$i$c$k$e$n. Finally, a third rule file that adds the symbols $@, $%.
I'm looking for something more straightforward, like in the "Password Definition Language" of the Parallel Password Recovery program mentioned earlier in this thread. See 4.2.2. "Dictionary words and their modifiers" at http://www.parallelrecovery.com/password...nguage.php:
$w = a word from the main dictionary
$u = a word from the user dictionary

The example I gave would be very easy to code.

What I also want is the trackability to record that I ran bugs.txt with the 'Append2Numbers rule combined with animals.txt with the 'Append1Special rule. Then I'll have a record of which dictionaries were run with which rules, to avoid repeating attacks, and plans which ones to try next.