Charset files don't behave as expected?

Charset files don't behave as expected? - bbcjared - 04-19-2021

I have a hash encrypted with a Cyrillic password, "хуй", three characters long.

Whenever I try to use a charset file, hashcat fails to crack it.

With the official encoding file:

Code:
>hashcat -d 4 -m 11300 -a 3 --custom-charset1 /home/maxim/tmp/hashcat-6.1.1/charsets/standard/Russian/ru_cp1251.hcchr bt_wallet_huy.hash ?1?1?1

Session..........: hashcat

Status...........: Exhausted

Hash.Name........: Bitcoin/Litecoin wallet.dat

Hash.Target......: $bitcoin$96$79e09876f9674db8e947bb8713ea5f9a14dc1be...5d1b88

Time.Started.....: Mon Apr 19 14:53:21 2021 (10 secs)

Time.Estimated...: Mon Apr 19 14:53:31 2021 (0 secs)

Guess.Mask.......: ?1?1?1 [3]

Guess.Charset....: -1 custom.hccrs, -2 Undefined, -3 Undefined, -4 Undefined

Guess.Queue......: 1/1 (100.00%)

Speed.#4.........:      13 H/s (3.82ms) @ Accel:2 Loops:256 Thr:1024 Vec:1

Recovered........: 0/1 (0.00%) Digests

Progress.........: 125/125 (100.00%)

Rejected.........: 0/125 (0.00%)

Restore.Point....: 25/25 (100.00%)

Restore.Sub.#4...: Salt:0 Amplifier:4-5 Iteration:131328-131446

Candidates.#4....: $HEX[b9b985] -> $HEX[b9d1d1]

Hardware.Mon.#4..: Temp: 47c Fan:  0% Util: 99% Core:1987MHz Mem:3802MHz Bus:8

Or with my custom file, like this:

Code:
Only those characters in hcchr file

>cat custom.hcchr

хуй

>hashcat -d 4 -m 11300 -a 3 --custom-charset1 custom.hcchr bt_wallet_huy.hash ?1?1?1

Session..........: hashcat

Status...........: Exhausted

Hash.Name........: Bitcoin/Litecoin wallet.dat

Hash.Target......: $bitcoin$96$79e09876f9674db8e947bb8713ea5f9a14dc1be...5d1b88

Time.Started.....: Mon Apr 19 14:53:21 2021 (10 secs)

Time.Estimated...: Mon Apr 19 14:53:31 2021 (0 secs)

Guess.Mask.......: ?1?1?1 [3]

Guess.Charset....: -1 custom.hcchr, -2 Undefined, -3 Undefined, -4 Undefined

Guess.Queue......: 1/1 (100.00%)

Speed.#4.........:      13 H/s (3.82ms) @ Accel:2 Loops:256 Thr:1024 Vec:1

Recovered........: 0/1 (0.00%) Digests

Progress.........: 125/125 (100.00%)

Rejected.........: 0/125 (0.00%)

Restore.Point....: 25/25 (100.00%)

Restore.Sub.#4...: Salt:0 Amplifier:4-5 Iteration:131328-131446

Candidates.#4....: $HEX[b9b985] -> $HEX[b9d1d1]

Hardware.Mon.#4..: Temp: 47c Fan:  0% Util: 99% Core:1987MHz Mem:3802MHz Bus:8

See how the candidates are shown in HEX, even though my terminal uses the same encoding as the charset file and the password?

But whenever I replace each ?1 with a double ?1?1 per character (each Russian character encodes to a four-digit hex code), I'm able to find the password?!

Code:
>hashcat -d 4 -m 11300 -a 3 --custom-charset1 custom.hcchr bt_wallet_huy.hash ?1?1?1?1?1?1

Session..........: hashcat

Status...........: Cracked

Hash.Name........: Bitcoin/Litecoin wallet.dat

Hash.Target......: $bitcoin$96$79e09876f9674db8e947bb8713ea5f9a14dc1be...5d1b88

Time.Started.....: Mon Apr 19 14:53:53 2021 (3 secs)

Time.Estimated...: Mon Apr 19 14:53:56 2021 (0 secs)

Guess.Mask.......: ?1?1?1?1?1?1 [6]

Guess.Charset....: -1 custom.hcchr, -2 Undefined, -3 Undefined, -4 Undefined

Guess.Queue......: 1/1 (100.00%)

Speed.#4.........:    1706 H/s (1.75ms) @ Accel:4 Loops:128 Thr:1024 Vec:1

Recovered........: 1/1 (100.00%) Digests

Progress.........: 6250/15625 (40.00%)

Rejected.........: 0/6250 (0.00%)

Restore.Point....: 0/3125 (0.00%)

Restore.Sub.#4...: Salt:0 Amplifier:1-2 Iteration:131328-131446

Candidates.#4....: $HEX[d1b98583d183] -> $HEX[d1d1d1d1d1d1]

Hardware.Mon.#4..: Temp: 52c Fan:  0% Util: 98% Core:1987MHz Mem:3802MHz Bus:8

So why does the mask behave as if I'm using --hex-charset when the charset contains actual characters?

Thanks

RE: Charset files don't behave as expected? - Snoopy - 04-20-2021

This is a basic misunderstanding of how hashcat works with UTF-8 (or otherwise multi-byte) encoded characters.

hashcat works on bytes. For example, the German special character ä is c3a4 in hex (UTF-8), which is 2 bytes long, so it can only be cracked with, for example, ?b?b or ?1?1 (with a suitable charset).

So for a human ä is just ONE char, but for hashcat it is always 2 bytes, and if you want to attack it your mask must have at least 2 positions. Mask length equals character count only for basic ASCII.

Each mask position is 1 byte by default, so to crack:

aä your mask must be of length 3 (in hex: 61 c3a4)
ää your mask must be of length 4 (in hex: c3a4 c3a4)
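
The byte counts above can be checked outside hashcat. The following is a minimal Python sketch, not part of hashcat; the helper name mask_positions is made up for illustration, and it assumes the password is hashed as UTF-8 unless another encoding is passed in.

```python
def mask_positions(password: str, encoding: str = "utf-8") -> int:
    """Return how many one-byte mask positions are needed for the password."""
    return len(password.encode(encoding))


# Compare the character count a human sees with the byte count hashcat sees.
for word in ["a", "\u00e4", "a\u00e4", "\u00e4\u00e4", "хуй"]:
    raw = word.encode("utf-8")
    print(f"{word!r}: {len(word)} char(s), {len(raw)} byte(s), hex {raw.hex()}")

# The same Cyrillic password in a single-byte encoding needs only 3 positions.
print(mask_positions("хуй", "cp1251"))
```

For the password from the first post this gives 6 positions in UTF-8, which is why ?1?1?1?1?1?1 succeeded where ?1?1?1 was exhausted.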

RE: Charset files don't behave as expected? - bbcjared - 04-20-2021

So when searching for a 6-character Cyrillic password with a mask, I'd have to double that, and now I'm searching for a 12-character password?!? i.e. ?1?1?1?1?1?1?1?1?1?1?1?1

Then there are the Chinese and Korean character sets, which are 3 bytes long in UTF-8 (e.g. E384B1), so a Korean 6-symbol password becomes ?1?1?1?1?1?1?1?1?1?1?1?1?1?1?1?1?1?1 (18 positions long), even though I'm looking for a combination of 6 symbols from a charset.

RE: Charset files don't behave as expected? - CATuGHTI - 04-22-2021

Would be nice to get a reply from one of the devs if this is actually true, I'm especially interested in Korean charset and it would really suck if each character from the character set would take ?1?1?1 (three!) spots in the mask.

RE: Charset files don't behave as expected? - Chick3nman - 04-22-2021

(04-22-2021, 03:56 PM)CATuGHTI Wrote: Would be nice to get a reply from one of the devs if this is actually true, I'm especially interested in Korean charset and it would really suck if each character from the character set would take ?1?1?1 (three!) spots in the mask.

Characters that require multiple bytes are handled as multiple bytes. A 3-byte character requires 3 positions in a mask. Hashcat works on bytes for a number of reasons; we do not currently support "wide characters" as single units because they are not single units to the algorithms hashing them or the buffers storing them. There are tricks for cracking this sort of character efficiently, but all of them rely on clever attack design, not on extended support in the kernels. I'm not sure whether this is something we will look at implementing, given the reasons we currently don't.
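
To illustrate how this plays out for the charset file from the first post, here is a small Python sketch. It assumes the file is UTF-8 and that each distinct byte in it becomes one candidate per mask position; that assumption is not taken from hashcat's source, but it is consistent with the Progress counters shown earlier in the thread. The helper name unique_bytes is hypothetical.

```python
def unique_bytes(charset_text: str) -> list[int]:
    """Distinct bytes in the UTF-8 encoding of the charset text, sorted."""
    return sorted(set(charset_text.encode("utf-8")))


charset = unique_bytes("хуй")
print([f"{b:02x}" for b in charset])   # the bytes that fill each ?1 position
print(len(charset) ** 3)               # keyspace for ?1?1?1
print(len(charset) ** 6)               # keyspace for ?1?1?1?1?1?1
```

The three characters reduce to five distinct bytes (83, 85, b9, d0, d1), giving keyspaces of 125 and 15625, the same totals as in the two status outputs.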

RE: Charset files don't behave as expected? - Snoopy - 04-23-2021

@bbcjared
@catughti

One of these "clever" tricks mentioned by Chick3nman to reduce the keyspace:
if you look carefully at how, e.g., Cyrillic is encoded
https://www.utf8-chartable.de/unicode-utf8-table.pl?start=1024

all characters start with d0, d1, d2, d3 or d4, so you can modify/optimize your charset definitions to something like this:

d0d1d2d3d4,?b,?1?2
d0d1d2d3d4,?b,?1?2?1?2
d0d1d2d3d4,?b,?1?2?1?2?1?2

to search for 1, 2, and 3 Cyrillic characters.
You can optimize this further for the second byte of each character if needed. Don't forget to tell hashcat that the charset is given in hex (--hex-charset).

For 3-byte characters (Korean) you have to extend the trick further; I hope you get the idea.
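
A rough Python sketch of where those lead bytes come from, and how much the second byte can still be narrowed. The code point range U+0400 to U+052F (Cyrillic plus Cyrillic Supplement) is an assumption chosen to match the d0 to d4 prefixes above, and the helper name utf8_byte_sets is made up for illustration.

```python
def utf8_byte_sets(first_cp: int, last_cp: int):
    """Collect the distinct first and second UTF-8 bytes of a 2-byte range."""
    lead, cont = set(), set()
    for cp in range(first_cp, last_cp + 1):
        encoded = chr(cp).encode("utf-8")
        if len(encoded) != 2:
            raise ValueError(f"U+{cp:04X} is not a 2-byte character")
        lead.add(encoded[0])
        cont.add(encoded[1])
    return sorted(lead), sorted(cont)


lead, cont = utf8_byte_sets(0x0400, 0x052F)
lead_hex = "".join(f"{b:02x}" for b in lead)

# Mask lines in the "charset1,charset2,mask" form for 1, 2 and 3 characters.
mask_lines = [f"{lead_hex},?b,{'?1?2' * n}" for n in (1, 2, 3)]
print("\n".join(mask_lines))

# Candidates per character: plain ?b?b, lead bytes + ?b, lead + second bytes.
print(256 * 256, len(lead) * 256, len(lead) * len(cont))
```

Per character this drops the keyspace from 65536 (?b?b) to 1280 with the lead-byte charset, and to 320 if the second position is also restricted to the 64 bytes 80 to bf.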

RE: Charset files don't behave as expected? - CATuGHTI - 04-23-2021

@snoopy.

That is what I've been doing for Cyrillic, and it works fine for a 3-character password, for example:
(cyrillic.hcchr lists all the hex codes without the d0/d1 prefix)

Code:
hashcat -d 4,5 -a 3 -m xx -1 D0D1 -2 cyrillic.hcchr --hex-charset hash.hash ?1?2?1?2?1?2

The problem is that each character takes 3 spots in the mask for Korean and Chinese (any Asian set), and with a maximum mask length of 20 we are limited to 6-symbol passwords this way.

Also, it's not as easy and fun in Korean as in Russian with charsets; I get lots of junk when I try to combine 3 sets the same way as with Russian.
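
The amount of "junk" can be estimated with a short Python sketch. It uses the Hangul Syllables block (U+AC00 to U+D7A3) as a stand-in, which is an assumption and not the 1971-entry list mentioned later in the thread; the helper name positional_byte_sets is hypothetical.

```python
import math


def positional_byte_sets(chars):
    """For fixed-width characters, return the set of bytes seen at each position."""
    encoded = [c.encode("utf-8") for c in chars]
    width = len(encoded[0])
    if any(len(e) != width for e in encoded):
        raise ValueError("all characters must encode to the same byte length")
    return [{e[i] for e in encoded} for i in range(width)]


hangul = [chr(cp) for cp in range(0xAC00, 0xD7A4)]
sets = positional_byte_sets(hangul)

combos = math.prod(len(s) for s in sets)   # candidates from three independent sets
valid = len(hangul)                        # candidates that are real syllables
print([len(s) for s in sets], combos, valid)
print(f"junk share: {1 - valid / combos:.1%}")
```

With three independent per-position charsets this yields 16384 byte combinations per character against 11172 real syllables, so roughly a third of the candidates per character are not in the target set, and the share grows with every additional character.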

I wish hashcat were smart enough to take one char from the .hcchr file and put it in place of ONE ?1.

BTW, this behavior is not described in the manual https://hashcat.net/wiki/doku.php?id=mask_attack (section "Hashcat charset files"), which gives the false impression that one char goes in place of each ?1.

Thanks for the answer anyway.

RE: Charset files don't behave as expected? - Snoopy - 04-23-2021

Why max 6 chars / max mask 20?

You could mask it again, like:
-1 D0D1 -2 cyrillic.hcchr -3 ?1?2 --hex-charset hash.hash ?3?3?3?3?3

As for Korean and Chinese, isn't one "char" there a whole word?

Maybe you are better off generating a wordlist with these chars and then going further with princeprocessor or something similar.

RE: Charset files don't behave as expected? - CATuGHTI - 04-24-2021

@snoopy
Well, this was a facepalm moment for me, thanks a lot!

))

I can combine char lists, duh.

Code:
?3 = ?1+?2

As for Korean, some words are 2, some 3, some 6 characters long (just one word). Also, a character changes and builds up as you type; see https://www.howtostudykorean.com/unit0/unit0lesson1/
I have no idea how they teach this to kids.

And as for wordlists.

There are about 2k characters:

Code:
cat Korean_utf8.txt |wc -l

1971

Combining those with combinator to get 3-character words is 72G already.

Code:
>ls -lh Korean_combined_3_char.txt

-rw-r--r-- 1 xxx xxx 72G Apr 17 17:43 Korean_combined_3_char.txt

>tail Korean_combined_3_char.txt

ㆆㆆ히

ㆆㆆ힉

ㆆㆆ힌

ㆆㆆ힐
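
As a sanity check on that file size, here is a back-of-the-envelope Python calculation. It assumes every one of the 1971 characters encodes to 3 bytes in UTF-8 and that each line ends with a single newline byte; neither detail is stated in the post.

```python
chars = 1971          # lines in Korean_utf8.txt, per the wc -l output
bytes_per_char = 3    # assumption: every character is 3 bytes in UTF-8
word_length = 3       # characters per combined word

lines = chars ** word_length
line_bytes = bytes_per_char * word_length + 1   # +1 for the newline
total_bytes = lines * line_bytes

print(f"{lines:,} lines, {total_bytes:,} bytes, {total_bytes / 2**30:.1f} GiB")
```

That comes to about 7.66 billion lines and roughly 71.3 GiB, in line with the 72G reported by ls. Each extra character multiplies the size by about 1971, so a 4-character list would already be well over 100 TiB.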

Will try using prince, maybe piping prince to hashcat.

Is this still considered a good attack?

Code:
pp64 < wordlist.dict | hashcat [options] target.hash -r prince_optimized.rule

My biggest complaint about piping anything to hashcat is being unable to save progress with --session.

Thanks

RE: Charset files don't behave as expected? - Snoopy - 04-26-2021

Regarding prince and sessions:

you can use these options to limit the output of prince to a given number of words

--skip=0
--limit=5000000

and after the run

--skip=5000000
--limit=10000000

I use batches of 5 million to attack a list of salted hashes; I get around 2000 MH/s, and it took me about 3 days per round.

Don't worry about the progress numbers: they count words × rules (I think the prince generated rule set has about 1500 rules), so every generated word is multiplied by 1500.