07-11-2018, 06:00 PM
(07-11-2018, 05:46 PM)undeath Wrote: Your findings are correct. UTF-8 is fully ASCII-compatible, and Latin characters (along with numbers and the basic set of special characters) are represented with only one byte.
As you already noticed, hashcat is oblivious to character encodings (except for --encoding-from/--encoding-to) and thus the issue of multibyte encodings is an open problem.
Of course you can construct masks that assume certain characters are two bytes while others are one, but you'll need a single mask for each possibility.
OK, thank you Undeath. Glad to know I wasn't missing something super obvious or misunderstanding how it all works. The creation of individual masks for each possible combination of 1- and 2-byte characters is... not appealing.
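To put a rough number on it, here is a minimal sketch (not anything hashcat provides) that enumerates one mask per 1-byte/2-byte layout for a password of a given character length. The function name byte_layout_masks is made up for illustration, and it assumes ?1 is a custom charset holding the single-byte characters, ?2 the UTF-8 lead bytes and ?3 the continuation bytes; how those charsets get defined is not shown here.

```python
from itertools import product


def byte_layout_masks(num_chars, one_byte="?1", two_byte="?2?3"):
    """Yield one mask per way of assigning 1 or 2 bytes to each character.

    Assumption: ?1 stands for a single-byte character, and ?2?3 for the
    lead byte and continuation byte of a two-byte UTF-8 character.
    """
    for widths in product((1, 2), repeat=num_chars):
        yield "".join(one_byte if w == 1 else two_byte for w in widths)


masks = list(byte_layout_masks(3))
print(len(masks))  # 2**3 layouts for a 3-character password
for mask in masks:
    print(mask)
```

The count doubles with every extra character (2**n masks for n characters), so an 8-character password already needs 256 separate masks.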
Future feature request?