Colliding password protected MS office 97-2003 documents - Printable Version +- hashcat Forum (https://hashcat.net/forum) +-- Forum: Misc (https://hashcat.net/forum/forum-15.html) +--- Forum: User Contributions (https://hashcat.net/forum/forum-25.html) +--- Thread: Colliding password protected MS office 97-2003 documents (/thread-3665.html) |
Colliding password protected MS office 97-2003 documents - atom - 09-08-2014 I recently worked on adding support to oclHashcat in order to crack the different versions of password protected MS Office documents. So far I've finished MS Office 2013, 2010, 2007 and the 97-2003 versions (there are different ones). The next version of oclHashcat (v1.31) will ship with support for all these versions. While I was working on the 97-2003 version I found out that there's a weakness in the scheme that I want to share. I'm not sure if this technique is already known, but I've never seen a cracker that does what is neccessary to exploit it. OTH, I don't know all of them. Also, as far as I understood it, this attack is different to the flaw discovered back in 2005, when Office is using the same keystream to encrypt two different versions / revisions of the same document, but don't nail me on that! The following explanations describe the vulnerability to a meet-in-the-middle attack. If you don't know what I mean with meet-in-the-middle attack, see here: https://hashcat.net/events/p13/js-ocohaaaa.pdf With a single hd7970 I'm able to collide every password protected MS Office 97-2003 documented using MD5/RC4 in less than 6 hours! And yes, you can open the document "normally" with a collided password. In order to explain it you first need to understand the KDF MS office uses. KDF
OK, what is this encryptedVerifier and encryptedVerifierHash? From oclHashcat's view it's just two blocks of random data of length 16 bytes. Why is MS office doing this? The answer is it enables a key validation. The idea is the following: generate 16 bytes of random data, then run it through MD5. Then store both, the original random data and the MD5 result after both were encrypted with RC4. Now, if one has the correct RC4 Key, he can decrypt back to the original random data and MD5 it. If the result matches the decryptedVerifierHash data we know we use the correct RC4 Key. Office will work the same way. MS original description: http://msdn.microsoft.com/en-us/library/dd920360%28v=office.12%29.aspx Problem The problem here is the truncation of the MD5 digest in step 6. OK, the key for the RC4 is 128 bit, but it is calculated out of a MD5 which uses just a 40 bit input (because the last 32 bit are just zeros). Exploitation The idea is to iterate through those 2^40 combinations, beginning from step 8. Once we find the correct RC4 Key, which is the case when step 13 is true, we do not need to do those steps ever again. From now on, in oclHashcat, we will just calculate steps 1-5 and then compare the first 5 byte with our pre-cracked intermediate hash. That's the meet-in-the-middle attack. To make it more comfortable, this technique requires two seperate oclHashcat modes. The first to brute-force the intermediate hash and once you cracked it just store it. Then you can use oclHashcat "normally" with the second mode. This mode works with the previosly cracked key and therefore only needs to calculates steps 1-5. Note that in this second mode you are not stick to Brute-Force, you can use any attack-mode you want. Actually you will not need to do that unless you try to find the real password that was used (forensic stuff). Just continue reading... So I made some real life tests how the cracking performance changes. One need to know RC4 isn't very GPU friendly. If we can avoid to compute it we will speed up the performance alot. The current speed, on a single hd7970, to crack password protected MS office 97-2003 documents is ~65 MH/s. That's still faster than any other cracker but if we exploit the weakness the numbers changes as following (on a single hd7970):
So we can speed up from 66MH/s to 390MH/s on a single hd7970. Not too bad, eh? Colliding With 390MH MH/s, all we need to find is a MD5 digest where the first 5 byte matches the precomputed intermediate hash. Problem is that in theory even by iterating through 2^40 combinations it's not guaranteed to find at least one matching one. But I believe there's still a good chance as it's only the first 5 byte we need to collide. After that, the collision will work as a valid password in MS Office. In the following experiment below I found my first collision after 14 seconds :-) Here's the log from my experiment that I made to ensure all this theoretical stuff is really working: Note: If you're wondering what this code section is about, ignore it. It's just info for myself if I want to reproduce it First mode, precompute intermediate hash Code: // kernel hacked at position step 5, check final hash to get out the intermediate hash Quote: Now that we have the intermediate hash $HEX[f2ab1219ae], we can move forward to our second module: Second mode, collide intermediate hash Code: // kernel hacked at Position step 5, check intermediate Hash: Quote: Collision found in 14 seconds :) Of course, there's no guarantee always to find a collision in 14 seconds, but it will not take to long... Test Yes, you can use the collision to open the document. If you want to try, here's the .doc I've been using: Document: https://hashcat.net/misc/DocOld2010.doc You can use both passwords:
Jens "atom" Steube RE: Colliding password protected MS office 97-2003 documents - epixoip - 09-08-2014 RE: Colliding password protected MS office 97-2003 documents - forumhero - 09-09-2014 atom > all RE: Colliding password protected MS office 97-2003 documents - Mem5 - 09-09-2014 Awesome ! Thank you ! Does it also work for Excel, PowerPoint 97-2003 password protected document ? Or only Word ? RE: Colliding password protected MS office 97-2003 documents - epixoip - 09-09-2014 works for all MS Office 97-2003 files RE: Colliding password protected MS office 97-2003 documents - blandyuk - 09-09-2014 This is awesome u so totally rock dude! RE: Colliding password protected MS office 97-2003 documents - atom - 09-09-2014 Thanks Guys! I've finished the special modes 9710 and 9720 for in BF mode for AMD cards. Beta-Tester can download latest beta version from beta site. Make sure you download beta17 or higher. To use the collider mode, you first need to crack the RC4 key as described above. Cracking the RC4 is done by mode 9710. I'd recommend you to use -o as it will save you some copy paste stuff. Code: ./oclHashcat64.bin -m 9710 hash -a 3 -w 3 ?b?b?b?b?b -o hash.rc4 Indeed it will be possible later on to not BF the RC4 key but use all attack-modes as you know them. BF is just cool as it's guaranteed to success in a short time. However I'd recomment yout to run your dictionaries first, with a small ruleset and if it does not hit, redo it with -a 3. Once you have the RC4 key, the output-line in hash.rc4 looks like this: Quote:$oldoffice$1*d6aabb63363188b9b73a88efb9c9152e*afbbb9254764273f8f4fad9a5d82981f*6f09fd2eafc4ade522b5f2bee0eaf66d:f2ab1219ae The RC4 key (in hex) was just apppend to the line. This line is exactly the format for the input-line for the collider mode 9720. That's why I recommend to use -o because you can use that file as hashfile for -m 9720 now. Code: ./oclHashcat64.bin -m 9720 hash.rc4 -a 3 -w 3 ?a?a?a?a?a?a If it exhausts, just add another ?a, it will collide sooner or later, for sure. Quote:$oldoffice$1*d6aabb63363188b9b73a88efb9c9152e*afbbb9254764273f8f4fad9a5d82981f*6f09fd2eafc4ade522b5f2bee0eaf66d:f2ab1219ae:zvDtu! Some other notes:
-- atom RE: Colliding password protected MS office 97-2003 documents - magnum - 09-09-2014 Cool. I don't quite see the gain with the mitm in this case though. If you skipped that and just made a "naive" yet optimized format, would you not get nearly 80Mp/s anyway since the significant work is RC4 anyway? And at first hit of that (within 2h on average), you'd be done. (Edit: you said 65Mp/s so 2:20h on average but I still don't see much point with mitm) That is unless you want to find more candidates for the same documents though (trying to spot the "real" password). In that case the mitm is a definitive speedup. Anyway, what really impress me is the RC4 speed you achieve! RE: Colliding password protected MS office 97-2003 documents - atom - 09-10-2014 You have a sharp eye! If you just want to crack the document you could stop the process after mode #1 and use the RC4 key to decrypt it. That would require special decrypting tools that know how to decrypt the .doc or .xls or whatever with only the RC4 key. I could imagine such tools exist. But that's another reason why I split the one mode 9700 into two modes 9710 and 9720. If you just want to crack the document and you have such a tool, all you need is 9710. But, as you said as well, there's often the case that you are not interessted in the document data itself. In that case you're only interessted in the real password that was used. That was actually my intention when I first thought about splitting the kernels, because I wanted to get rid of the RC4 stuff. This is where mitm idea came in. By removing RC4 stuff from the kernel, that would speed up th MD5 kernel. Afterwards, when I did that, I realized that it's not the full 128 bit MD5 digest that we need to hit, it's just 40 bits. This is when the Idea to collide it came in. RE: Colliding password protected MS office 97-2003 documents - atom - 09-10-2014 Another thing what I've just added, but only for the 9720 kernels, is that those hashes never get marked as "cracked" internally in oclHashcat. That means a matching password will be displayed or stored to outfile as normal, but afterwards oclHashcat can crack it again in case another candidate matches. This behavior is very unique to this hashmode. If you use this mode, you usually want to know all of the regular password that collide with the document password. I let it run for 20 minutes using the original document linked above. Here's a list of collisions that all do match each other, it's really funny to watch that:
-- atom |