NTLM Performance Problem
#1
Hello,
I have a small performance problem cracking an NTLM hash.

My Hardware:
Disk: Samsung 970 Plus M.2
GPU: 2x RTX 2080 TI

With the benchmark I get these results:
Quote:Hashmode: 1000 - NTLM

Speed.#1.........: 92787.1 MH/s (24.13ms) @ Accel:128 Loops:1024 Thr:256 Vec:2
Speed.#2.........: 89742.9 MH/s (24.92ms) @ Accel:128 Loops:1024 Thr:256 Vec:2
Speed.#*.........:  182.5 GH/s


With my favorite attack mode:
Quote:A wordlist with only 12 words
hashcat64.exe -a 6 -w 4 -i --increment-min=5 NTLM-Hash PW_List.txt -1 ?d?l!@MLK-?? ?1?1?1?1?1?1?1
Speed.#1.........:  8555.0 kH/s (1.05ms) @ Accel:128 Loops:1024 Thr:256 Vec:1
Speed.#2.........:        0 H/s (0.00ms) @ Accel:128 Loops:1024 Thr:256 Vec:1

So I looked at https://hashcat.net/faq/morework and tried the first suggestion, piping candidates into hashcat:
Quote:hashcat64.exe --stdout -a 6 -i PW_List.txt -1 ?l?d??!@LKM ?1?1?1?1?1?1?1 | hashcat64.exe -m 1000 NTLM-Hash

Speed.#1.........:  306.5 kH/s (2.35ms) @ Accel:512 Loops:1 Thr:64 Vec:1
Speed.#2.........:  315.0 kH/s (2.39ms) @ Accel:512 Loops:1 Thr:64 Vec:1

A pure wordlist attack is also not as fast as the benchmark results suggest:
Quote:hashcat64.exe -O -a 0 -w 4 -m 1000 NTLM-Hash wordlist_600million.txt
Speed.#1.........:  5998.7 kH/s (0.39ms) @ Accel:128 Loops:1 Thr:256 Vec:1
Speed.#2.........:  5912.1 kH/s (0.39ms) @ Accel:128 Loops:1 Thr:256 Vec:1
Speed.#*.........: 11910.9 kH/s

edit://
Now I made my wordlist a bit bigger and the performance looks much better (but still a long way from 182.5 GH/s):
Quote:hashcat64.exe -a 6 -w 4  PW_List.txt -1 ?d?l!@MLK-?? ?1?1?1?1 --stdout -o PW_List_+4.txt

hashcat64.exe -a 6 -w 4 -i NTLM-Hash PW_List_+4.txt -1 ?d?l!@MLK-?? ?1?1?1?1?1?1?1


Guess.Mod........: Mask (?1?1?1?1) [4], Right Side
Guess.Charset....: -1 ?d?l!@MLK-??, -2 Undefined, -3 Undefined, -4 Undefined
Speed.#1.........: 24265.2 MH/s (90.51ms) @ Accel:128 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 23676.4 MH/s (92.23ms) @ Accel:128 Loops:1024 Thr:256 Vec:1
Speed.#*.........: 47941.6 MH/s
#2
There are a few things going on here.

The very first thing I'd note is the warning/advice given when running a benchmark:
```

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.

```
Benchmark, by default, uses the optimized kernels, which restrict the max candidate length in order to increase speed. You should most likely be using the optimized kernels for your attacks so I would start by adding -O to everything you are running. Only your "pure wordlist" attack has it in place.

Now, this may not help all that much, and that's totally fine in most of the above cases. Benchmarking is done with a full workload, a single hash, and an optimized brute-force attack. It's meant to indicate the absolute maximum speed you can achieve against 1 hash with a brute-force mask of something like ?a?a?a?a?a?a?a?a?a. Candidates can be generated and loaded on the GPU much faster under those conditions than in other attacks, such as dictionary attacks, where candidates have to be loaded from the dictionary instead of being generated on the GPU.

Rules and other operations that generate or modify candidates on the GPU instead of the host are very fast and can be used to increase the speed of a running job if applied on top of a normal straight (dictionary) attack. Your attempts to increase work to get more performance are working, as you noted in the last attack; however, it is not really reasonable to expect mask/benchmark speeds out of a hybrid or wordlist attack.

If you run a simple mask attack, -a 3 ?a?a?a?a?a?a?a?a?a, and find that it shows roughly the same speed as your benchmark, then there shouldn't be any issues with your setup and it just comes down to optimizing your workload for the best possible performance.
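To make the "more work" point concrete, here is a minimal Python sketch (plain arithmetic, not part of hashcat). It assumes, as I read the morework FAQ linked in the first post, that in a hybrid -a 6 attack the wordlist supplies the base words and the mask is applied to each base word on the GPU, so a 12-word list gives the GPU very few base words to spread across its threads. The function name `split_work` and the idea of "moving" mask positions into the wordlist are illustrative assumptions; the charset is the -1 definition from the first post.

```python
import string

# Custom charset from the thread: -1 ?d?l!@MLK-??
# ?d = 10 digits, ?l = 26 lowercase letters, then the literals ! @ M L K -
# and ?? (hashcat's escape for a literal question mark).
CHARSET = string.digits + string.ascii_lowercase + "!@MLK-?"


def split_work(num_words, suffix_len, moved, charset_size=len(CHARSET)):
    """Return (base_words, mask_candidates_per_word) when `moved` of the
    `suffix_len` mask positions are pre-expanded into the wordlist."""
    if not 0 <= moved <= suffix_len:
        raise ValueError("moved must be between 0 and suffix_len")
    base_words = num_words * charset_size ** moved
    per_word = charset_size ** (suffix_len - moved)
    return base_words, per_word


if __name__ == "__main__":
    # 12 words and a 7-character suffix, as in the first post.
    for moved in range(5):
        base, per_word = split_work(12, 7, moved)
        print(f"moved={moved}: {base:>12,} base words x {per_word:>15,} mask candidates each")
```

The total keyspace is the same in every row; only the share that sits in the wordlist changes. Pre-expanding four positions turns 12 base words into about 41 million, which is the kind of change the edit in the first post made.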
#3
Thanks.

The PC is currently running the last character set for the next 24 h. Later I will look into hashcat's checkpoint function to pause the run; otherwise I will start testing some optimizations tomorrow and report the results here.

If pure brute force is much faster than my current ~47.9 GH/s, I will write a small batch file that works through the words in my list sequentially, like this:

hashcat64.exe ................. -1 ?d?l!@MLK-??  password_1?1?1?1?1?1?1?1?1?1?1
hashcat64.exe ................. -1 ?d?l!@MLK-??  password_2?1?1?1?1?1?1?1?1?1?1
hashcat64.exe ................. -1 ?d?l!@MLK-??  password_3?1?1?1?1?1?1?1?1?1?1
hashcat64.exe ................. -1 ?d?l!@MLK-??  password_4?1?1?1?1?1?1?1?1?1?1
hashcat64.exe ................. -1 ?d?l!@MLK-??  password_5?1?1?1?1?1?1?1?1?1?1
#4
That attack is not going to have good speed either, because static mask prefixes slow hashcat down. Generate a rules file for the mask part and do a wordlist+rules attack.
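One way to generate such a rules file, sketched here in Python as an alternative to a dedicated tool like maskprocessor: each rule line is a sequence of hashcat `$X` append functions, one per mask position. The function name `append_rules` is a hypothetical helper, and the charset is the -1 definition used earlier in the thread.

```python
import itertools
import string

# Same custom charset as -1 ?d?l!@MLK-?? (the trailing ?? is a literal '?').
CHARSET = string.digits + string.ascii_lowercase + "!@MLK-?"


def append_rules(charset, length):
    """Yield one hashcat rule line per combination, e.g. '$a $b $c $d'.

    '$X' is hashcat's rule function for appending the character X.
    """
    for combo in itertools.product(charset, repeat=length):
        yield " ".join("$" + ch for ch in combo)


if __name__ == "__main__":
    # Preview only; for a real run, write every rule to a file instead.
    for rule in itertools.islice(append_rules(CHARSET, 4), 3):
        print(rule)
    print(len(CHARSET) ** 4, "rules in total for length 4")
```

Writing all rules for four positions gives 43^4 = 3,418,801 lines, roughly 44 MB with Windows line endings. The resulting file would be passed to hashcat with -r in a straight (-a 0) attack.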
#5
(01-04-2020, 12:56 PM)undeath Wrote: That attack is not going to have good speed either, because static mask prefixes slow hashcat down. Generate a rules file for the mask part and do a wordlist+rules attack.

The brute force is also not as fast as I hoped:


Code:
hashcat64.exe -O -w 4 -a 3 -i -m 1000 Hash -1 ?d?l??!@-MLK xxx?1?1?1?1?1?1?1?1

Time.Estimated...: Sat Jan 04 14:46:26 2020 (6 mins, 14 secs)
Guess.Mask.......: XXXXXXXX?1?1?1?1?1?1?1 [14]
Guess.Charset....: -1 ?d?l??!@-MLK, -2 Undefined, -3 Undefined, -4 Undefined
Guess.Queue......: 14/15 (93.33%)
Speed.#1.........:  305.1 MH/s (0.47ms) @ Accel:128 Loops:1 Thr:256 Vec:2
Speed.#2.........:  291.7 MH/s (0.47ms) @ Accel:128 Loops:1 Thr:256 Vec:2
Speed.#*.........:  596.8 MH/s

I will try the wordlist+rules approach and post my results ;)
#6
This is my first time working with rules, but I think the rule file should be fine:

Code:
mp64.exe -o test_4.rule -1 ?l?d??@!-MLK "$?1 $?1 $?1 $?1"

I started hashcat with:
Code:
hashcat64.exe -a 0 -w 4 -O -m 1000 HASH PW_List+4.txt -r test_4.rule

But my cracking speed is slower than with wordlist+mask ;-/
Code:
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........: 18199.2 MH/s (28.69ms) @ Accel:128 Loops:256 Thr:256 Vec:1
Speed.#2.........: 17529.8 MH/s (29.37ms) @ Accel:128 Loops:256 Thr:256 Vec:1
Speed.#*.........: 35730.2 MH/s

My rule-file is ~44MB, my wordlist ~500MB

Quote:You can use it in your cracking session by setting the -O option.
Ah, I forgot this in the attack. Now it runs much faster, but still at only about 1/3 of the benchmark.
Is there any other trick to push the speed?

Code:
Guess.Mod........: Mask (?1?1?1?1?1) [5], Right Side
Guess.Charset....: -1 ?d?l!@MLK-??, -2 Undefined, -3 Undefined, -4 Undefined
Speed.#1.........: 32800.0 MH/s (62.36ms) @ Accel:128 Loops:1024 Thr:256 Vec:1
Speed.#2.........: 29737.9 MH/s (64.31ms) @ Accel:128 Loops:1024 Thr:256 Vec:1
Speed.#*.........: 62539.4 MH/s
#7
Over the last few days I changed my hardware setup a little.
I sold one of my GPUs and switched the CPU to an i7-9700K; for future setups I need a better CPU ;)

So I'm still working on improving the NTLM cracking speed.

Benchmark: Speed.#1.........: 99231.6 MH/s
Brute force (-a 3): Speed.#1.........: 70089.3 MH/s
Wordlist+mask (-a 6): Speed.#1.........: 21546.2 kH/s

Because the wordlist+mask attack is much slower than brute force, I tried setting each of the first 4 characters as its own single-character charset, like this:
Code:
hashcat -O -w 4 -a 3 -m 1000 66D7AB6B09E29F9AA06B9FE593A1765F  -1 H -2 E -3 L -4 O ?1?2?3?4?a?a?a?a?a?a

Speed.#1.........:  256.9 MH/s

I expected to get the same speed as pure brute force with the whole charset. Any idea how to improve the speed?
I often need NTLM, mostly with a given wordlist of ~20 candidates and a mask of 1-7 random characters.
#8
Setting the prefix like that causes a serious degradation in performance. You definitely don't want to try to accomplish it that way. Wordlist+rules will probably be your best bet for the type of candidates you want to generate at the speeds you want. I don't expect it to reach benchmark speeds by any means, but it will be faster than your hybrid or mask attempts.
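For a sense of scale, the workload described in post #7 (about 20 words, each followed by 1 to 7 characters) can be sized with a few lines of Python. This is only a back-of-the-envelope sketch. The 43-character charset is assumed from the earlier posts, the helper names are hypothetical, and the speeds are copied from status lines in this thread that were measured on different hardware and different attacks, so the resulting times are order-of-magnitude illustrations, not predictions.

```python
def total_candidates(num_words, charset_size, min_len, max_len):
    """Candidates for every word followed by min_len..max_len charset characters."""
    return sum(num_words * charset_size ** k for k in range(min_len, max_len + 1))


def seconds_at(candidates, hashes_per_second):
    """Time in seconds to exhaust the keyspace at a constant speed."""
    return candidates / hashes_per_second


if __name__ == "__main__":
    candidates = total_candidates(num_words=20, charset_size=43, min_len=1, max_len=7)
    # Speeds quoted in the thread, converted to H/s (illustrative inputs only).
    speeds = {
        "hybrid -a 6, post #7": 21546.2e3,
        "static-prefix mask, post #7": 256.9e6,
        "wordlist+rules, post #6": 35730.2e6,
    }
    print(f"{candidates:,} candidates")
    for label, speed in speeds.items():
        print(f"{label}: {seconds_at(candidates, speed) / 3600:.2f} h")
```

With these inputs the three cases come out at roughly three days, six hours, and a few minutes. Whether the wordlist+rules speed is reachable for this exact job is a separate question: a rule file covering all seven positions would have 43^7 lines, so in practice part of the suffix would likely have to be pre-expanded into the wordlist, as in the edit to the first post.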