Crypto for newbies: How to crack Office 97-2003
Hello,

Today I will share a little info about how Word 97-2003 encryption works.
If you find a mistake, please feel free to report it.
If you find something that can be optimized, feel free to share it in the comments.


Introduction

Microsoft Word versions 97 through 2003 use a crypto scheme defined in MS-OFFCRYPTO. (https://msdn.microsoft.com/en-us/library...e.12).aspx)
The Encryption Key Derivation can be found here: https://msdn.microsoft.com/en-us/library...12%29.aspx
After reading these documents, I dug a little deeper to find an easier explanation.


Understanding

Thanks to atom, this was not a hard task.
Here is everything explained step-by-step: https://hashcat.net/forum/thread-3665.html
So, you will see that MS Word uses RC4 + MD5 for modes $0 and $1, and RC4 + SHA1 for modes $3 and $4.
RC4 = the algorithm used for encryption.
SHA1 = hash function.
MD5 = hash function.
So, you will use RC4 with either MD5 or SHA1.
Where do you find this $0, $1, $3, $4? It is the first part of the hash extracted with office2hashcat.py (https://github.com/stricture/hashstack-s...hashcat.py)
E.g., an extracted hash: $oldoffice$1*d6aabb63363188b9b73a88efb9c9152e*afbbb9254764273f8f4fad9a5d82981f*6f09fd2eafc4ade522b5f2bee0eaf66d (https://hashcat.net/forum/thread-3665.html)
As you can see, after the word oldoffice we have $1, so this hash uses RC4 + MD5.


Extracted hash

The extracted hash has these fields:
1) Kind of encryption scheme used ($0, $1, $3, $4)
2) Salt => what is between the 1st and 2nd asterisks => d6aabb63363188b9b73a88efb9c9152e
3) EncryptedVerifier => what is between the 2nd and 3rd asterisks => afbbb9254764273f8f4fad9a5d82981f
4) EncryptedVerifierHash => what is after the 3rd asterisk => 6f09fd2eafc4ade522b5f2bee0eaf66d
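
The same split can be done in a few lines of Python. This is only a sketch: the function name parse_oldoffice and the dictionary keys are my own choice, not something defined by hashcat or office2hashcat.py, and it handles only the plain four-field form shown above.

```python
def parse_oldoffice(line: str) -> dict:
    """Split a $oldoffice$ hash line into its four fields (hypothetical helper)."""
    prefix = "$oldoffice$"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError("not an $oldoffice$ hash")
    # Everything after the prefix is: version*salt*verifier*verifierhash
    version, salt, verifier, verifier_hash = line[len(prefix):].split("*")
    # $0/$1 use MD5, $3/$4 use SHA1 (as described earlier in this post)
    digest = {"0": "MD5", "1": "MD5", "3": "SHA1", "4": "SHA1"}[version]
    return {
        "version": int(version),
        "digest": digest,
        "salt": bytes.fromhex(salt),
        "encrypted_verifier": bytes.fromhex(verifier),
        "encrypted_verifier_hash": bytes.fromhex(verifier_hash),
    }


example = ("$oldoffice$1*d6aabb63363188b9b73a88efb9c9152e"
           "*afbbb9254764273f8f4fad9a5d82981f"
           "*6f09fd2eafc4ade522b5f2bee0eaf66d")
for name, value in parse_oldoffice(example).items():
    print(name, "=", value.hex() if isinstance(value, bytes) else value)
```

Working with bytes instead of hex strings makes the later MD5 and RC4 steps easier, because those operate on raw bytes and not on text.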


What to do now?

Atom posted this:
Quote:"KDF

1. Generate 16 byte random salt
2. Calculate MD5 of unicode version of the password
3. Truncate 16 byte result to 5 byte
4. Generate a string of length 336 byte by repeating the string "$digest$salt" 16 times -- (16 * (5 + 16)) = 336
5. MD5 the 336 bytes
6. Truncate 16 byte result to 5 byte
7. Append 4 byte zeros to result
8. MD5 the 9 bytes
9. Use 16 byte result as 128 bit RC4 Key
10. Decrypt encryptedVerifier with RC4 to decryptedVerifier
11. Decrypt encryptedVerifierHash with RC4 to decryptedVerifierHash
12. MD5 the decrypted encryptedVerifier
13. Compare 16 byte result with decrypted encryptedVerifierHash"

This is what we will do, step-by-step.


Preparation

For this task we will need some tools.
I will use online tools here, so it is easy for everyone to follow.

 1) http://rc4.online-domain-tools.com => RC4 online

 2) https://www.mobilefish.com/services/lati...to_hex.php => Convert password plain text to UTF-16LE (LE = Little-Endian)

 3) https://www.fileformat.info/tool/hash.htm => Hash MD5 as hex input, instead of ASCII input

 4) https://github.com/stricture/hashstack-s...hashcat.py => Extract hash from office files.
       4.1) We will not need this, because we will use a provided hash.

 5) https://hashcat.net/misc/DocOld2010.doc => file used in this example. Generated by atom.


Hands on

We know the password is hashcat, which is useful for following every step and understanding what we are doing.
Let's do it.

Quote:Password                                            hashcat
Password converted to UTF-16LE.         6800610073006800630061007400 => https://www.mobilefish.com/services/lati...to_hex.php

Salt                                                    d6aabb63363188b9b73a88efb9c9152e
EncryptedVerifier                                 afbbb9254764273f8f4fad9a5d82981f
EncryptedVerifierHash                          6f09fd2eafc4ade522b5f2bee0eaf66d

KDF

01. Generate 16 byte random salt
Quote:R = d6aabb63363188b9b73a88efb9c9152e

02. Calculate MD5 of unicode version of the password
Quote:R = MD5(6800610073006800630061007400) = 2303b15bfa48c74a74758135a0df1201 
https://www.fileformat.info/tool/hash.htm => use the field "Binary hash" and paste 6800610073006800630061007400

03. Truncate 16 byte result to 5 byte
Quote:R = 2303b15bfa
Each byte is made of 2 hex chars, so <23><03><b1><5b><fa> = 5 bytes.


04. Generate a string of length 336 byte by repeating the string "$digest$salt" 16 times -- (16 * (5 + 16)) = 336
Quote:R = MD5(2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e2303b15bfad6aabb63363188b9b73a88efb9c9152e)

05. MD5 the 336 bytes
Quote:R = f2ab1219aec36ce247dfb13a03940d3e

06. Truncate 16 byte result to 5 byte
Quote:R = f2ab1219ae

07. Append 4 byte zeros to result
Quote:R = f2ab1219ae00000000

08. MD5 the 9 bytes
Quote:R = e5d13462ff792f3ed224acd4bfb03da9

09. Use 16 byte result as 128 bit RC4 Key
Quote:R = e5d13462ff792f3ed224acd4bfb03da9
http://rc4.online-domain-tools.com

10. Decrypt encryptedVerifier with RC4 to decryptedVerifier
Quote:R = RC4(afbbb9254764273f8f4fad9a5d82981f) = d6aabb63363188b9b73a88efb9c9152e

11. * Decrypt encryptedVerifierHash with RC4 to decryptedVerifierHash *
Quote:R = RC4(6f09fd2eafc4ade522b5f2bee0eaf66d) = 1aad4f1dd3efa5f11ca9670a4b7335fd

See note below

12. MD5 the decrypted encryptedVerifier
Quote:R = MD5(d6aabb63363188b9b73a88efb9c9152e) = 1aad4f1dd3efa5f11ca9670a4b7335fd

13. Compare 16 byte result with decrypted encryptedVerifierHash
Quote:MD5(RC4Decrypt(EncryptedVerifier)) == RC4Decrypt(EncryptedVerifierHash)
MD5(d6aabb63363188b9b73a88efb9c9152e) == 1aad4f1dd3efa5f11ca9670a4b7335fd
1aad4f1dd3efa5f11ca9670a4b7335fd == 1aad4f1dd3efa5f11ca9670a4b7335fd

So, as we can see, the password is correct.
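
For anyone who prefers a script over the online tools, steps 02 to 09 can be sketched in Python with only hashlib. The function name derive_rc4_key is hypothetical, and the sketch covers only the MD5 variant ($0 and $1) exactly as listed in atom's KDF quote.

```python
import hashlib


def derive_rc4_key(password: str, salt: bytes) -> bytes:
    """Steps 02-09 of the KDF for the MD5 variant (hypothetical helper)."""
    # Step 02: MD5 of the UTF-16LE ("unicode") password
    digest = hashlib.md5(password.encode("utf-16-le")).digest()
    # Step 03: keep only the first 5 bytes
    digest5 = digest[:5]
    # Step 04: repeat digest5 + salt 16 times => 16 * (5 + 16) = 336 bytes
    buf = (digest5 + salt) * 16
    # Steps 05-06: MD5 the 336 bytes and truncate to 5 bytes again
    intermediate5 = hashlib.md5(buf).digest()[:5]
    # Steps 07-09: append 4 zero bytes, MD5 the 9 bytes, use all 16 bytes as key
    return hashlib.md5(intermediate5 + b"\x00" * 4).digest()


salt = bytes.fromhex("d6aabb63363188b9b73a88efb9c9152e")
# Compare the printed value with the result of step 09 in the walkthrough
print(derive_rc4_key("hashcat", salt).hex())
```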

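* Decrypting the encryptedVerifierHash on its own always gave me a wrong value. The reason is that RC4 is a stream cipher: the verifier and its hash were encrypted as one continuous 32-byte stream, so a separate decryption restarts the keystream at byte 0 instead of continuing at byte 16. To find this value, decrypt both fields together:
RC4(encryptedVerifier + encryptedVerifierHash) = RC4(afbbb9254764273f8f4fad9a5d82981f6f09fd2eafc4ade522b5f2bee0eaf66d)
RC4 = d6aabb63363188b9b73a88efb9c9152e1aad4f1dd3efa5f11ca9670a4b7335fd
The first 16 bytes = decryptedVerifier = d6aabb63363188b9b73a88efb9c9152e
The last 16 bytes = decryptedVerifierHash = 1aad4f1dd3efa5f11ca9670a4b7335fd
The RC4 key comes from the 5 bytes f2ab1219ae (step 06), expanded to e5d13462ff792f3ed224acd4bfb03da9 in steps 07 to 09.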
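
Steps 10 to 13 can be checked with a small script as well. Python has no RC4 in its standard library, so the sketch below carries a plain textbook RC4; the names rc4 and check_key are my own. It decrypts both fields as one stream, and then decrypts the hash separately to show the value you get when the keystream restarts.

```python
import hashlib


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4; encryption and decryption are the same operation."""
    # Key-scheduling algorithm (KSA)
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Keystream generation (PRGA), XORed byte by byte with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def check_key(rc4_key: bytes, enc_verifier: bytes, enc_hash: bytes) -> bool:
    """Steps 10-13: decrypt both fields as ONE stream, then compare MD5."""
    plain = rc4(rc4_key, enc_verifier + enc_hash)
    verifier, verifier_hash = plain[:16], plain[16:]
    return hashlib.md5(verifier).digest() == verifier_hash


rc4_key = bytes.fromhex("e5d13462ff792f3ed224acd4bfb03da9")  # step 09
enc_verifier = bytes.fromhex("afbbb9254764273f8f4fad9a5d82981f")
enc_hash = bytes.fromhex("6f09fd2eafc4ade522b5f2bee0eaf66d")

print("key accepted:", check_key(rc4_key, enc_verifier, enc_hash))
# Decrypting the hash alone reuses keystream bytes 0..15 instead of 16..31
print("hash decrypted separately:", rc4(rc4_key, enc_hash).hex())
```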


Understanding how to crack the password

Atom said:
Quote:"Exploitation

The idea is to iterate through those 2^40 combinations, beginning from step 8. Once we find the correct RC4 Key, which is the case when step 13 is true, we do not need to do those steps ever again. From now on, in oclHashcat, we will just calculate steps 1-5 and then compare the first 5 byte with our pre-cracked intermediate hash. That's the meet-in-the-middle attack."

Translating: generate 5 bytes -> append 4 zero bytes -> MD5(5 bytes + 00000000) -> use the result as the RC4 key -> check MD5(RC4Decrypt(EncryptedVerifier)) == RC4Decrypt(EncryptedVerifierHash) -> if this is true, you have found an RC4 key that decrypts this file.
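
As a sketch, that loop looks like this in Python. The function find_intermediate is a hypothetical name, and pure Python is far too slow for the full 2^40 space (that is what hashcat and a GPU are for), so the demo only searches a tiny window around the value already known from the walkthrough.

```python
import hashlib
from typing import Iterable, Optional


def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4 (KSA followed by keystream XOR)."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def find_intermediate(enc_verifier: bytes, enc_hash: bytes,
                      candidates: Iterable[int]) -> Optional[bytes]:
    """Try each 40-bit candidate, starting from step 07 of the KDF."""
    for n in candidates:
        five = n.to_bytes(5, "big")
        rc4_key = hashlib.md5(five + b"\x00" * 4).digest()    # steps 07-09
        plain = rc4(rc4_key, enc_verifier + enc_hash)          # steps 10-11
        if hashlib.md5(plain[:16]).digest() == plain[16:]:     # steps 12-13
            return five
    return None


enc_verifier = bytes.fromhex("afbbb9254764273f8f4fad9a5d82981f")
enc_hash = bytes.fromhex("6f09fd2eafc4ade522b5f2bee0eaf66d")
# Demo only: a window of 2000 candidates around the value from step 06
start = int.from_bytes(bytes.fromhex("f2ab1219ae"), "big") - 1000
found = find_intermediate(enc_verifier, enc_hash, range(start, start + 2000))
print(found.hex() if found else "not found in this window")
```

Note that the password and the salt never appear in this loop. That is why the search space is fixed at 2^40, no matter how long or complex the real password is.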


Colliding

OK, now you know the RC4 key.
You can decrypt the file with this key using other tools, but you still do not have a password to open the file.
Decrypting the file gives you access to its content, but the point here is to obtain a password that opens it, not only to decrypt it.
So we generate candidate passwords and run each one through steps 01 to 06.
If the result of step 06 equals the key found, you have found a valid password.

Quote:Password candidate                    zvDtu!
Password converted to UTF-16LE 7a0076004400740075002100

Salt                                          d6aabb63363188b9b73a88efb9c9152e
EncryptedVerifier                       afbbb9254764273f8f4fad9a5d82981f
EncryptedVerifierHash                6f09fd2eafc4ade522b5f2bee0eaf66d

KDF

01. Generate 16 byte random salt
R = d6aabb63363188b9b73a88efb9c9152e

02. Calculate MD5 of unicode version of the password
R = MD5(7a0076004400740075002100) = 2c280e504af43aaa1d6bbfb205302424

03. Truncate 16 byte result to 5 byte
R = 2c280e504a

04. Generate a string of length 336 byte by repeating the string "$digest$salt" 16 times -- (16 * (5 + 16)) = 336
R = MD5(2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e2c280e504ad6aabb63363188b9b73a88efb9c9152e)

05. MD5 the 336 bytes
R = f2ab1219ae2fa884df9f25a50b9dc5cb

06. Truncate 16 byte result to 5 byte
R = f2ab1219ae

07. Append 4 byte zeros to result
R = f2ab1219ae00000000

08. MD5 the 9 bytes
R = e5d13462ff792f3ed224acd4bfb03da9

09. Use 16 byte result as 128 bit RC4 Key
R = e5d13462ff792f3ed224acd4bfb03da9

10. Decrypt encryptedVerifier with RC4 to decryptedVerifier
R = RC4(afbbb9254764273f8f4fad9a5d82981f) = d6aabb63363188b9b73a88efb9c9152e

11. Decrypt encryptedVerifierHash with RC4 to decryptedVerifierHash
R = RC4(6f09fd2eafc4ade522b5f2bee0eaf66d) = 1aad4f1dd3efa5f11ca9670a4b7335fd

12. MD5 the decrypted encryptedVerifier
R = MD5(d6aabb63363188b9b73a88efb9c9152e) = 1aad4f1dd3efa5f11ca9670a4b7335fd

13. Compare 16 byte result with decrypted encryptedVerifierHash
MD5(RC4Decrypt(EncryptedVerifier)) == RC4Decrypt(EncryptedVerifierHash)
MD5(d6aabb63363188b9b73a88efb9c9152e) == 1aad4f1dd3efa5f11ca9670a4b7335fd
1aad4f1dd3efa5f11ca9670a4b7335fd == 1aad4f1dd3efa5f11ca9670a4b7335fd

Here you can see that the results of steps 02 to 05 are different.
But from step 06 onward everything is the same for the passwords <hashcat> and <zvDtu!>, so both passwords can open the file.
Why does this happen? Because we only have to match the 5 bytes of step 06 (40 bits), not the whole value (128 bits).
This is called a collision, because two different passwords generate the same 40-bit value.
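
The collision is easy to check with a script. The sketch below runs steps 02 to 06 for both passwords with the salt of this file; intermediate_key is a name I made up for the example.

```python
import hashlib


def intermediate_key(password: str, salt: bytes) -> bytes:
    """Steps 02-06: the 5 bytes that a candidate password must match."""
    digest5 = hashlib.md5(password.encode("utf-16-le")).digest()[:5]
    return hashlib.md5((digest5 + salt) * 16).digest()[:5]


salt = bytes.fromhex("d6aabb63363188b9b73a88efb9c9152e")
results = {pw: intermediate_key(pw, salt) for pw in ("hashcat", "zvDtu!")}
for pw, value in results.items():
    print(pw, "=>", value.hex())
# If the walkthrough is right, both lines show the same 5 bytes
print("collision:", results["hashcat"] == results["zvDtu!"])
```

The collision depends on the salt, so <zvDtu!> should only be expected to open this particular file, not other files protected with the password hashcat.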


File-independent key

I have seen a lot of programs and scripts designed to crack the RC4 key.
One of these programs has a feature that caught my attention: "Search for file-independent key, allowing to instantly decrypt files with the same password"
This is very useful: only one key to decrypt every file with the same password.
If you generate a new file with the password hashcat, you will see that the RC4 key is different.
This means that for every single file you have to crack the RC4 key, even if they use the same password.
I really liked this feature, but how to replicate it?
We have a salt here, which makes every key unique even with the same password. That is the point of using a salt, by the way: password + salt (random) = unique output.
After some thinking, the answer was obvious: you have to process everything BEFORE the salt comes in.
What? Relax, let's do it.

Quote:Password                                   hashcat
Password converted to UTF-16LE 6800610073006800630061007400

Salt                                          d6aabb63363188b9b73a88efb9c9152e
EncryptedVerifier                       afbbb9254764273f8f4fad9a5d82981f
EncryptedVerifierHash                6f09fd2eafc4ade522b5f2bee0eaf66d


KDF

02. Calculate MD5 of unicode version of the password
R = MD5(6800610073006800630061007400) = 2303b15bfa48c74a74758135a0df1201 

03. Truncate 16 byte result to 5 byte
R = 2303b15bfa

Look at step 03. The result is <2303b15bfa>.
This value IS the file-independent key.
Why? Because every time you use the password hashcat the result will be the same, since the salt is not involved in this step.
So, with this key you will be able to decrypt every file (Office 97-2003) created with the password hashcat: you only need to combine it with each file's salt (steps 04 to 09) to get that file's RC4 key.
By the way, I do not know of any program that makes use of it, except for that one. It is paid.
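
A minimal sketch of the idea, assuming the MD5 variant ($0 and $1): the first function stops at step 03, and the second one continues from step 04 for whatever salt a given file has. Both function names are hypothetical, and the second salt is made up just to show that the per-file key changes.

```python
import hashlib


def file_independent_key(password: str) -> bytes:
    """Steps 02-03: depends only on the password, never on the salt."""
    return hashlib.md5(password.encode("utf-16-le")).digest()[:5]


def rc4_key_for_file(independent_key: bytes, salt: bytes) -> bytes:
    """Steps 04-09: combine the 5-byte key with one file's salt."""
    intermediate5 = hashlib.md5((independent_key + salt) * 16).digest()[:5]
    return hashlib.md5(intermediate5 + b"\x00" * 4).digest()


key = file_independent_key("hashcat")
print("file-independent key:", key.hex())

salts = (
    "d6aabb63363188b9b73a88efb9c9152e",  # salt of the example file
    "00112233445566778899aabbccddeeff",  # made-up salt of another file
)
for salt_hex in salts:
    print(salt_hex, "=>", rc4_key_for_file(key, bytes.fromhex(salt_hex)).hex())
```

The password itself is not needed in the second function, which is why one recovered 5-byte value is enough for every file protected with the same password.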


Using Hashcat

After understanding how to do it manually, let's do it with hashcat.
First create a file and save this inside: $oldoffice$1*d6aabb63363188b9b73a88efb9c9152e*afbbb9254764273f8f4fad9a5d82981f*6f09fd2eafc4ade522b5f2bee0eaf66d
Now, let's look at the hashcat modes that you can use:
-m 9700 = find a password
-m 9710 = crack the RC4 key
-m 9720 = collide the RC4 key with a candidate password

So, we can take one of these paths:
hashcat -m 9700 -a 3 <file.hash> -i ?a?a?a?a?a?a => tries to find a valid password to open the file. This mode does the same thing as -m 9710 plus -m 9720.
hashcat -m 9710 -a 3 <file.hash> --hex-charset ?b?b?b?b?b => recovers the RC4 key only, not the password.
hashcat -m 9720 -a 3 <file.rc4> -i ?a?a?a?a?a?a => tries to find a password from the RC4 key.

Inside <file.hash>: $oldoffice$1*d6aabb63363188b9b73a88efb9c9152e*afbbb9254764273f8f4fad9a5d82981f*6f09fd2eafc4ade522b5f2bee0eaf66d
Inside <file.rc4>: $oldoffice$1*d6aabb63363188b9b73a88efb9c9152e*afbbb9254764273f8f4fad9a5d82981f*6f09fd2eafc4ade522b5f2bee0eaf66d:f2ab1219ae

You do not have to use all 3 options; use either -m 9700 alone, or -m 9710 followed by -m 9720.


Conclusion

That is the whole process I went through to understand how to calculate the RC4 key and the file-independent key, and how to collide the RC4 key to find a valid password.
If you see the flag $3 or $4, replace MD5 with SHA1.
If you like the post, let me know.
If it gets positive feedback, I will do my best to write more posts in this style.
Thank you for your time and patience.


Free Notes

I would like to give a special "thank you" to bmenrigh, Chick3nman, and atom for providing me with a lot of valuable info and help.

Source
http://www.woodmann.com/forum/archive/in...-2971.html
https://blogs.msdn.microsoft.com/openspe...ification/
https://hashcat.net/forum/thread-3665.html