Tesla K20m sha512crypt dictionary attack performance issues
Hello,
I'm experiencing a puzzling (at least, for me) behaviour while performing a dictionary-only attack on a sha512crypt hash.

Scenario:
Hardware is a server with dual Xeon E5-2603v2 CPU, 32GB RAM, 4x Nvidia Tesla K20m with 5GB memory each.
Software is CentOS Linux 6.5 64-bit, the NVIDIA 319.37 driver from the official NVIDIA repos, and oclHashcat 1.01 (the cudaHashcat64.bin build).

SELinux is disabled and the system isn't doing anything else.

My command line is:

Code:
cudaHashcat64.bin --gpu-accel=1 --gpu-loops=1024 -m 1800 /root/tests/sha512.hashes /root/tests/60milliondict.txt

The sha512.hashes file contains ten sha512crypt hashes (with different salts) and the dictionary contains 60 million 8-character passwords.

The gpu-accel and gpu-loops parameters were chosen through exhaustive benchmarking (i.e. trying a set of possible gpu-accel and gpu-loops combinations).
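
For reference, the kind of parameter sweep described above can be sketched in Python as follows. This is only a minimal illustration: the candidate values in ACCEL_VALUES and LOOPS_VALUES are hypothetical, not the ones actually tried, and the script only prints the command lines rather than running them.

```python
from itertools import product

# Paths and binary name taken from the command line in this post.
BINARY = "cudaHashcat64.bin"
HASH_FILE = "/root/tests/sha512.hashes"
WORDLIST = "/root/tests/60milliondict.txt"

# Hypothetical candidate values (assumption): the real sweep may have used others.
ACCEL_VALUES = [1, 8, 16, 32]
LOOPS_VALUES = [256, 512, 1024]


def build_commands(accel_values, loops_values):
    """Return one argument list per (gpu-accel, gpu-loops) combination."""
    commands = []
    for accel, loops in product(accel_values, loops_values):
        commands.append([
            BINARY,
            f"--gpu-accel={accel}",
            f"--gpu-loops={loops}",
            "-m", "1800",
            HASH_FILE,
            WORDLIST,
        ])
    return commands


if __name__ == "__main__":
    # Print the commands only; each one would be run and timed by hand.
    for command in build_commands(ACCEL_VALUES, LOOPS_VALUES):
        print(" ".join(command))
```

Each printed line has the same shape as the command above, with only the two tuning flags changing between runs.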

What happens is that during bruteforcing I get 400% CPU usage. It's a dual-socket machine with 4 cores per CPU and no hyperthreading, so 800% would mean the server is completely CPU saturated. Still, I don't understand what's going on: I thought that dictionary-only attacks weren't really CPU bound, since there is no wordlist to generate on the fly.
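
To make the CPU arithmetic explicit, here is a small Python sketch. The socket, core and GPU counts come from the hardware description above; the function name cpu_saturation is just an illustrative helper.

```python
# CPU headroom arithmetic for the machine described in this post.
SOCKETS = 2            # dual Xeon E5-2603v2
CORES_PER_SOCKET = 4   # 4 cores per CPU, no hyperthreading
GPUS = 4               # 4x Tesla K20m


def cpu_saturation(observed_percent, sockets=SOCKETS, cores=CORES_PER_SOCKET):
    """Return (ceiling_percent, fraction_used, busy_cores) for a top-style reading."""
    ceiling_percent = sockets * cores * 100
    fraction_used = observed_percent / ceiling_percent
    busy_cores = observed_percent / 100
    return ceiling_percent, fraction_used, busy_cores


if __name__ == "__main__":
    ceiling, fraction, busy = cpu_saturation(400)
    print(f"ceiling: {ceiling}%, used: {fraction:.0%}, busy cores: {busy:.0f}")
    print(f"busy cores per GPU: {busy / GPUS:.1f}")
```

So 400% is half of the 800% ceiling, i.e. four cores' worth of load, which happens to equal the number of GPUs. Whether that is actually one busy host thread per GPU is only a guess on my part.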

There's something else bothering me as well: if I check the status while cracking, the data is something like this:

Quote:
Speed.GPU.#1...: 628 H/s
Speed.GPU.#2...: 635 H/s
Speed.GPU.#3...: 635 H/s
Speed.GPU.#4...: 635 H/s
Speed.GPU.#*...: 2533 H/s
Recovered......: 0/10 (0.00%) Digests, 0/10 (0.00%) Salts
Progress.......: 692224/600000000 (0.12%)
Rejected.......: 0/692224 (0.00%)
HWMon.GPU.#1...: 99% Util, 36c Temp, -1% Fan
HWMon.GPU.#2...: 99% Util, 37c Temp, -1% Fan
HWMon.GPU.#3...: 99% Util, 39c Temp, -1% Fan
HWMon.GPU.#4...: 99% Util, 40c Temp, -1% Fan

While the GPU utilization is high, the machine seems to manage about 2500 passwords/s against each of the 10 hashes in the file, i.e. roughly 25K hashes/s (consistent with the benchmark below). That seems a bit low IMHO, since I saw posts like this one:

http://hashcat.net/forum/archive/index.p...-2340.html

Where a dual Tesla K20m system seems to perform at the same level as mine. The sha512crypt benchmark is not available on that page, but the sha512 benchmark figure is the same as mine even though my system has twice as many GPUs.
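
The 2500*10 estimate can be written out as a short Python sketch. It assumes my reading of the status screen is right, namely that the reported speed counts passwords per second and that each password is tested against all ten salts; the helper names are only illustrative.

```python
# Figures from this post: dictionary size, number of salts, status-screen speed.
WORDS = 60_000_000
SALTS = 10
STATUS_SPEED = 2533  # Speed.GPU.#* in H/s


def hashes_per_second(status_speed, salts):
    """Hash computations per second if every password is tried against every salt."""
    return status_speed * salts


def seconds_to_exhaust(words, salts, status_speed):
    """Time to run the whole dictionary against all salts at a constant speed."""
    total_work = words * salts  # matches the 600000000 in the Progress line
    return total_work / hashes_per_second(status_speed, salts)


if __name__ == "__main__":
    rate = hashes_per_second(STATUS_SPEED, SALTS)
    hours = seconds_to_exhaust(WORDS, SALTS, STATUS_SPEED) / 3600
    print(f"{rate} hash computations/s, about {hours:.2f} hours for the full dictionary")
```

Under that assumption the whole 60 million word dictionary would take a bit over six and a half hours against these ten hashes.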

This is my sha512 + sha512crypt benchmark result:

Quote:
cudaHashcat64.bin -b --benchmark-mode 1 -m 1700
cudaHashcat v1.01 starting in benchmark-mode...

Device #1: Tesla K20m, 4799MB, 705Mhz, 13MCU
Device #2: Tesla K20m, 4799MB, 705Mhz, 13MCU
Device #3: Tesla K20m, 4799MB, 705Mhz, 13MCU
Device #4: Tesla K20m, 4799MB, 705Mhz, 13MCU

Hashtype: SHA512
Workload: 128 loops, 256 accel

Speed.GPU.#1.: 50713.8 kH/s
Speed.GPU.#2.: 50854.4 kH/s
Speed.GPU.#3.: 51067.6 kH/s
Speed.GPU.#4.: 50880.3 kH/s
Speed.GPU.#*.: 203.5 MH/s

Quote:
cudaHashcat v1.01 starting in benchmark-mode...

Device #1: Tesla K20m, 4799MB, 705Mhz, 13MCU
Device #2: Tesla K20m, 4799MB, 705Mhz, 13MCU
Device #3: Tesla K20m, 4799MB, 705Mhz, 13MCU
Device #4: Tesla K20m, 4799MB, 705Mhz, 13MCU

Hashtype: sha512crypt, SHA512(Unix)
Workload: 5000 loops, 8 accel

Speed.GPU.#1.: 6321 H/s
Speed.GPU.#2.: 6383 H/s
Speed.GPU.#3.: 6406 H/s
Speed.GPU.#4.: 6401 H/s
Speed.GPU.#*.: 25511 H/s
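
As a sanity check on the numbers above, this Python sketch adds up the per-device benchmark speeds and compares the live run with the sha512crypt benchmark. The live figure of 2533 H/s times 10 salts follows my own interpretation of the status screen and is an assumption.

```python
# Per-device speeds copied from the two benchmark runs above.
SHA512_KHS = [50713.8, 50854.4, 51067.6, 50880.3]   # kH/s
SHA512CRYPT_HS = [6321, 6383, 6406, 6401]           # H/s

# Live run: status speed times number of salts (assumption: my reading of the status).
LIVE_HS = 2533 * 10


def total_speed(per_device):
    """Sum the per-device speeds; the benchmark total should match this."""
    return sum(per_device)


def live_vs_benchmark(live, benchmark_total):
    """Fraction of the benchmark speed that the real attack reaches."""
    return live / benchmark_total


if __name__ == "__main__":
    sha512_mhs = total_speed(SHA512_KHS) / 1000
    crypt_hs = total_speed(SHA512CRYPT_HS)
    ratio = live_vs_benchmark(LIVE_HS, crypt_hs)
    print(f"sha512 total: {sha512_mhs:.1f} MH/s")
    print(f"sha512crypt total: {crypt_hs} H/s")
    print(f"live run / benchmark: {ratio:.1%}")
```

The sums agree with the reported totals, and the dictionary attack runs at roughly 99% of the benchmark speed, so the four cards do scale linearly with each other in both tests.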

But I've noticed other strange behaviours:

- Using --cpu-affinity to limit CPU usage seems to lower the system load (e.g. top shows just 100% load), but the bruteforcing performance stays the same at about 25K H/s.
- Running on a single GPU device with -d, so that it has all the CPUs to itself, doesn't improve speed (the single GPU cracks about 6.3K hashes/s).


So, my questions are:

- Is it normal for the CPU usage to be that high?
- Might the system be CPU bound?
- Is there any way to improve my performance?

Thanks to anyone that can help me.


Messages In This Thread
Tesla K20m sha512crypt dictionary attack performance issues - by afra - 02-24-2014, 01:24 PM