Masks for Multiple Language Charsets in UTF-8
#2
Your findings are correct. UTF-8 is fully ASCII-compatible, so Latin letters (along with digits and the basic set of special characters) are each represented with only one byte.
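To see the byte widths for yourself, here is a small Python sketch. The sample characters and the helper name utf8_len are only illustrative choices, not anything hashcat provides.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# Illustrative samples: ASCII letter, digit, symbol, then non-ASCII characters.
samples = ["a", "7", "!", "é", "я", "€"]
widths = {ch: utf8_len(ch) for ch in samples}

for ch, n in widths.items():
    print(f"{ch!r}: {n} byte(s)")
```

The ASCII characters come out as one byte each, while the accented Latin and Cyrillic letters take two bytes and the euro sign takes three.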

As you already noticed, hashcat is oblivious to character encodings (except for --encoding-from/--encoding-to), and thus the handling of multibyte encodings is an open problem.

Of course you can construct masks that assume certain characters are two bytes while others are one, but you'll need a separate mask for each possible combination.
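As a rough sketch of what "one mask per possibility" means, the Python below enumerates every mask for a password of a given length in characters. It assumes each position is either a single-byte ASCII lowercase letter (?l) or a two-byte character written as ?1?2, where ?1 and ?2 are custom charsets you would define yourself for the lead and continuation bytes (for example with -1 and -2 together with --hex-charset). The charset assignment and the function name masks_for_length are assumptions for illustration only.

```python
from itertools import product

# Assumption: ?l covers the one-byte (ASCII lowercase) case.
ONE_BYTE = "?l"
# Assumption: ?1 = UTF-8 lead bytes, ?2 = continuation bytes,
# both supplied by the user as custom charsets.
TWO_BYTE = "?1?2"

def masks_for_length(n_chars: int) -> list[str]:
    """Build every mask where each character is either one or two bytes."""
    return [
        "".join(combo)
        for combo in product((ONE_BYTE, TWO_BYTE), repeat=n_chars)
    ]

masks = masks_for_length(3)
for mask in masks:
    print(mask)
```

For a password of n characters this yields 2^n masks (8 for n = 3), so the list grows quickly with length, and it grows further if three-byte characters also have to be considered.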


Messages In This Thread
RE: Masks for Multiple Language Charsets in UTF-8 - by undeath - 07-11-2018, 05:46 PM