10-04-2013, 11:08 AM 
		
	
	
Getting some interesting results for NETNTLMv1 ... It would seem hashcat with a dictionary and rules is much faster than oclHashcat (brute force or dictionary + rules) for a large number of hashes. Also, hashcat with a dictionary and rules is the only scenario where the speed actually increases when adding more (salted) hashes!
Can someone explain why I am seeing these results? What's the bottleneck?
I understand that some optimization has been done to speed up NETNTLMv1 by making use of the last DES chunk, which holds only 2 bytes of the NTLM hash, i.e. you first crack the last DES chunk and then only check the NTLM hashes whose last 2 bytes match it to get the resulting NetNTLM response.
Has this optimization been done for both hashcat and oclHashcat and for all attack modes, and could this partially explain these results?
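To make the 2-byte idea concrete, here is a minimal sketch of the key layout it relies on: the 16-byte NTLM hash is zero-padded to 21 bytes and cut into three 7-byte chunks (each is expanded to a DES key with parity bits), so the third chunk has only 65536 possible values. The function names are made up for illustration, the DES step itself is left out, and this is not taken from the hashcat or oclHashcat source.

```python
# Sketch of the NetNTLMv1 key layout behind the 2-byte shortcut.
# All names here are illustrative; this is not hashcat's code.

def split_into_des_keys(ntlm_hash: bytes) -> tuple[bytes, bytes, bytes]:
    """Pad the 16-byte NTLM hash to 21 bytes and cut it into three 7-byte chunks."""
    if len(ntlm_hash) != 16:
        raise ValueError("NTLM hash must be 16 bytes")
    padded = ntlm_hash + b"\x00" * 5
    return padded[0:7], padded[7:14], padded[14:21]


def third_key_candidates():
    """Yield every possible third chunk: 2 unknown bytes followed by 5 zero bytes."""
    for value in range(1 << 16):
        yield value.to_bytes(2, "big") + b"\x00" * 5


def passes_prefilter(ntlm_hash: bytes, recovered_tail: bytes) -> bool:
    """Cheap check: only hashes ending in the recovered 2 bytes need the full DES test."""
    return ntlm_hash[14:16] == recovered_tail


if __name__ == "__main__":
    example_hash = bytes(range(16))  # placeholder bytes, not a real NTLM hash
    k1, k2, k3 = split_into_des_keys(example_hash)
    print(k3.hex())
    print(passes_prefilter(example_hash, k3[:2]))
```

In a real cracker the 65536 candidates would each be run through DES against the challenge once per hash, and the recovered 2 bytes would then reject almost all candidate NTLM hashes before the two expensive DES checks.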
Summary:
---------
hc-st-h5 ?? - 56MH/s
hc-st-h200 ?? - 92MH/s
hc-br-h5 - 12MH/s
hc-br-h200 - 0.4MH/s
ocl15-st-h5 - 32MH/s
ocl15-st-h200 - 0.8MH/s
ocl15-br-h5 - 95MH/s
ocl15-br-h200 - 2.3MH/s
ocl14-st-h5 - 51MH/s
where:
-------
hc = hashcat v0.46
ocl15 = oclhashcat v0.15
ocl14 = oclhashcat v0.14
st = straight with rules
br = brute
h5 = 5 hashes
h200 = 200 hashes
?? = unexpected result
System:
CPU: Core i3 530
GPU: NVIDIA GTS250
hashlists:
in5: 5 hashes (salted - NETNTLMv1)
in200: 200 hashes (salted - NETNTLMv1)
hashcat v0.46
--------------
hashcat-cli64.exe -m5500 -a0 -c 1000 -n3 --remove --pw-min=8 -o out in5 ..\Dic\04 -r append4.rule
Speed/sec.: 56.54M plains, 61 words
hashcat-cli64.exe -m5500 -a0 -c 1000 -n3 --remove --pw-min=8 -o out in200 ..\Dic\04 -r append4.rule
Speed/sec.: 92.16M plains, - words
hashcat-cli64.exe -m5500 -a3 -c 1000 -n3 --remove --pw-min=8 -o out in5 ?u?l?l?l?d?d?d?s
Speed/sec.: - plains, 12.23M words
hashcat-cli64.exe -m5500 -a3 -c 1000 -n3 --remove --pw-min=8 -o out in200 ?u?l?l?l?d?d?d?s
Speed/sec.: - plains, 467.49k words
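For scale, the size of the mask used in the brute-force runs can be worked out directly. This is a rough sketch that assumes the default built-in charsets (26 upper, 26 lower, 10 digits, 33 specials including space), that the reported "words" rate stays constant, and that the run goes to exhaustion. The helper name is hypothetical and it only understands two-character placeholders.

```python
# Charset sizes for the built-in mask placeholders used in the runs above.
CHARSET_SIZES = {"?u": 26, "?l": 26, "?d": 10, "?s": 33}


def mask_keyspace(mask: str) -> int:
    """Multiply the charset sizes of each two-character placeholder in the mask."""
    total = 1
    for i in range(0, len(mask), 2):
        total *= CHARSET_SIZES[mask[i:i + 2]]
    return total


keyspace = mask_keyspace("?u?l?l?l?d?d?d?s")  # 26**4 * 10**3 * 33 candidates

# Reported hashcat v0.46 brute-force speeds, in candidates per second.
for label, speed in [("in5", 12.23e6), ("in200", 467.49e3)]:
    print(f"{label}: about {keyspace / speed / 3600:.2f} hours to exhaust the mask")
```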
oclhashcat:
-----------
v0.15
cudaHashcat-plus64.exe -m5500 -a3 --gpu-temp-disable --remove -o out in5 ?u?l?l?l?d?d?d?s
Speed.GPU.#1...: 95837.1 kH/s
cudaHashcat-plus64.exe -m5500 -a3 --gpu-temp-disable --remove -o out in200 ?u?l?l?l?d?d?d?s
Speed.GPU.#1...: 2365.0 kH/s
cudaHashcat-plus64.exe -m5500 -a0 --gpu-temp-disable --remove -o out in5 ..\Dic\04 -r append4.rule
Speed.GPU.#1...: 32813.1 kH/s
cudaHashcat-plus64.exe -m5500 -a0 --gpu-temp-disable --remove -o out in200 ..\Dic\04 -r append4.rule
Speed.GPU.#1...: 816.9 kH/s
v0.14 (faster than v0.15 - because of the <15 char password optimization?)
cudaHashcat-plus64.exe -m5500 -a0 --gpu-temp-disable --remove -o out in5 ..\Dic\04 -r append4.rule
Speed.GPU.#1...: 51197.8k/s
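To put the numbers side by side, the short script below computes how much each tool slows down going from 5 to 200 hashes. The speeds are copied from the output above; treating hashcat's plains/words figures and oclHashcat's H/s figures as comparable is an assumption. If a slowdown lands near 40x for 40x more hashes, that would be consistent with every candidate being tested once per distinct challenge, but that reading is a guess and not something the output confirms.

```python
# Speeds copied from the runs above, as (5 hashes, 200 hashes) per second.
SPEEDS = {
    "hashcat 0.46 straight": (56.54e6, 92.16e6),
    "hashcat 0.46 brute": (12.23e6, 467.49e3),
    "oclHashcat 0.15 straight": (32813.1e3, 816.9e3),
    "oclHashcat 0.15 brute": (95837.1e3, 2365.0e3),
}
HASH_COUNT_RATIO = 200 / 5  # the larger list has 40x more hashes


def slowdown(speed_h5: float, speed_h200: float) -> float:
    """Return how many times slower the 200-hash run is (values below 1 mean faster)."""
    return speed_h5 / speed_h200


for name, (h5, h200) in SPEEDS.items():
    factor = slowdown(h5, h200)
    print(f"{name}: {factor:.2f}x slowdown for {HASH_COUNT_RATIO:.0f}x more hashes")
```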
	
	
	
	
 
 

 


... it's only a 4-character dictionary, not GBs in size like we are accustomed to using. But it was only used to make a point in this case: oclHashcat "brute force" for NETNTLMv1 is slower than hashcat with dictionary + rules. So this is not a question of saturating the GPU.
... So coming back to the topic, the comparison is still valid in my eyes. Note: the CPU is the old original Core i3 (nobody laugh). My use of these tools is more academic than practical. Besides, even today smart rules and dictionaries still crack 50% of all hashes.