Masks for Multiple Language Charsets in UTF-8
#1
Hi all.

I've read pretty much everything I can find on the subject of masks and charsets, but I can't find or work out a solution for this issue. For the record, the resource I followed most closely was: http://www.netmux.com/blog/ultimate-guid...-using-has, in concert with the FAQ and Wiki entries on custom character sets and masks.

I am trying to adapt the rockyou masks to support both the Russian and Basic Latin (English) character sets within the same password strings. The hashes were originally created on a system with UTF-8 encoding. From my understanding, the best (only?) way to represent UTF-8 characters in a mask is to use --hex-charset, with -1 holding the first-byte range and -2 holding the second-byte range. For the record, I'm able to crack a password that uses ONLY Russian characters.

I've tried creating masks where ?1/2/3/4 are the literal characters, but that was unsuccessful in cracking any known passwords. (The cracking was done on an Ubuntu system with hashcat 4.x and UTF-8 as the locale/environment.) I've also tried cracking hashes of known Russian-only passwords, created under UTF-8, by using the built-in Russian character sets, and that fails too. (1 byte vs. 2 bytes per character, I'm assuming.)

Here is a mask which successfully cracks a 3 character (6 byte) Russian password when used with --hex-charset:
Code:
d0,808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeaf,d0d1,b0b1b2b3b4b5b6b7b8b9babbbcbdbebf808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9f,?1?2?3?4?3?4
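To sanity-check what that line covers, here is a minimal Python sketch (not part of hashcat) that parses an hcmask line with hex charsets and tests whether a password's UTF-8 bytes fit it. The function names are made up for illustration, the parser only handles masks built purely from ?1..?4, and "Мир" is a hypothetical stand-in for the real test password.

```python
def parse_hex_hcmask(line):
    """Split 'cs1,cs2,...,mask' into per-charset byte sets and a list of slot numbers."""
    *charsets, mask = line.strip().split(",")
    sets = [set(bytes.fromhex(cs)) for cs in charsets]
    # Assumption: the mask consists only of ?1..?4 placeholders, two chars each.
    slots = [int(mask[i + 1]) for i in range(0, len(mask), 2)]
    return sets, slots


def matches(line, password):
    """True if every UTF-8 byte of password is allowed at its mask position."""
    sets, slots = parse_hex_hcmask(line)
    data = password.encode("utf-8")
    return len(data) == len(slots) and all(
        byte in sets[slot - 1] for byte, slot in zip(data, slots)
    )


# Rebuild the mask line above from byte ranges to avoid copy/paste typos.
CS2 = bytes(range(0x80, 0xB0)).hex()
CS4 = (bytes(range(0xB0, 0xC0)) + bytes(range(0x80, 0xA0))).hex()
RUSSIAN_3 = f"d0,{CS2},d0d1,{CS4},?1?2?3?4?3?4"

if __name__ == "__main__":
    print(matches(RUSSIAN_3, "Мир"))   # uppercase first letter, then two lowercase
    print(matches(RUSSIAN_3, "Мирs"))  # 7 bytes against a 6-byte mask
```

Read this way, ?1?2 is an uppercase letter (d0 followed by 80..af) and each ?3?4 pair is meant as a lowercase letter, although ?3?4 also admits byte pairs outside the lowercase Russian range (for example d0 80 or d1 b0).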

The issue I'm encountering is that Basic Latin characters in UTF-8 appear to be encoded with only one byte each, so a mask that assumes 2 bytes per character will not work. I took the same password cracked with the above mask, appended a Latin 's' (lowercase s) to it, and updated the mask line to the following, hoping that addressing a Latin character as \x00\x## would work. It does not. It appears that, for whatever reason in the combination of hashcat, hash environment, crack environment, and encoding specs, "s" in UTF-8 is just \x73, not \x00\x73.
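The byte counts are easy to confirm outside hashcat. This short Python sketch prints the UTF-8 bytes of a few characters; `utf8_hex` is just a helper name chosen here, and "Мирs" is a hypothetical example password, not the real one.

```python
def utf8_hex(text):
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{byte:02x}" for byte in text.encode("utf-8"))


if __name__ == "__main__":
    for sample in ("s", "п", "Мирs"):
        data = sample.encode("utf-8")
        print(f"{sample!r}: {len(data)} byte(s): {utf8_hex(sample)}")

    # A zero-padded 00 73 is the UTF-16BE form of 's', not UTF-8.
    print("s".encode("utf-16-be").hex())
```

In UTF-8, code points U+0000..U+007F are always a single byte, so a three-letter Russian password plus one Latin letter is 7 bytes. A mask with eight byte positions can only produce 8-byte candidates, and a \x00 pad byte would become part of the candidate and change the hash.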

Code:
00d0,808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeaf4142434445464748494a4b4c4d4e4f505152535455565758595a,00d0d1,b0b1b2b3b4b5b6b7b8b9babbbcbdbebf808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9f6162636465666768696a6b6c6d6e6f707172737475767778797a,?1?2?3?4?3?4?3?4

(If anyone needs it, I can break down what is what within the mask. But I'm assuming anyone who knows enough to help answer the question also knows enough about character sets to be able to parse it for themselves if needed.)

And obviously, that Latin character could be anywhere inside the password, not just at the end, so the specific mask isn't the important part. 

So, I guess my most direct question is: is this possible? Is it possible to set up a mask with a variable-length, optional, or dependent component? For example, using the mixed-language hex charset, is there a way to tell it to ignore the first ?1 if the next character in the mask will be a ?2 between \x41 and \x5a? Or even a simple way of saying "some of these are one byte, and some are two bytes"? Or some other workaround? Also, if I'm entirely barking up the wrong tree with a core assumption here, please let me know.
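One possible workaround, offered as a sketch rather than a confirmed answer: since each mask position is exactly one byte, generate one mask line per layout of one-byte and two-byte characters and put them all in an .hcmask file. The Python below assumes the same charsets-then-mask line layout as the lines above, assumes that a line may define fewer than four charsets, and uses invented names (`LEAD`, `CONT`, `LATIN`, `mixed_masks`). Here ?1?2 stands for one Cyrillic character and ?3 for one Latin letter.

```python
from itertools import product

LEAD = "d0d1"                          # UTF-8 lead bytes for U+0400..U+047F
CONT = bytes(range(0x80, 0xC0)).hex()  # any continuation byte 80..bf
LATIN = (bytes(range(0x41, 0x5B)) + bytes(range(0x61, 0x7B))).hex()  # A-Z, a-z


def mixed_masks(n_chars):
    """Yield one hcmask line per Cyrillic (C) / Latin (L) layout of n_chars characters."""
    for layout in product("CL", repeat=n_chars):
        mask = "".join("?1?2" if kind == "C" else "?3" for kind in layout)
        yield f"{LEAD},{CONT},{LATIN},{mask}"


if __name__ == "__main__":
    # Write these lines to a file and pass it to hashcat together with --hex-charset.
    for line in mixed_masks(4):
        print(line)
```

This produces 2^n lines for n characters, and ?1?2 covers all of U+0400..U+047F (128 code points) rather than just the Russian alphabet, so the keyspace is larger than strictly needed. Splitting upper and lower case the way the masks above do would need more charset slots per line.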

Any other thoughts on what I'm missing, or something else I should try?

Thanks in advance.
A guy named Lou.

(So, it looks like I rambled a bit. Please feel free to ask if you want clarification on anything.)

EDIT to Add - No, my test didn't work.


Messages In This Thread
Masks for Multiple Language Charsets in UTF-8 - by Loopy - 07-11-2018, 05:30 PM