Freeze when optimized kernels are used
#1
Hi there,

First of all, thank you so much for this awesome piece of software!

I encountered a problem while attacking NTLM (mode 1000):

Brute Force attack with charset using the command: .\hashcat.exe -m 1000 -w 3 -O --increment --increment-min=1 hash.hsh -1 charset.hcchr ?1?1?1?1?1?1?1?1?1 -o pw.txt --status --status-timer=6000

I get about 300 GH/s with this command. But during these runs, the system freezes completely after a while and needs a reboot. After the reboot, I can restore the session but only get less than 200 GH/s. I have to fully shut down the machine and power it back on to get the 300 GH/s again.

The errors in the event log before the freeze are Event ID 14 from "nvlddmkm", referencing "\Device\Video39".

When I omit the optimized kernel parameter (-O), everything works fine for hours and even days, but I only get 210 - 220 GH/s.

Does this sound like a power issue to you? Do the optimized kernels put a heavier load on the GPUs, or are they only more efficient?
I might be imagining it, but the rack seems to sound different when the freeze happens: a bit quieter, as if a card were turning off or something like that.

Hashcat 6.2.6 runs on a mining rig with an MSI 5310 F PRO and 4 x GTX 1080 Ti + 6 x RTX 2080 Ti, on Windows 10 21H2.
Latest Nvidia drivers (536) and CUDA Toolkit (12.2.0).

I appreciate every hint. Getting only 70% of the performance is fine, but 300 GH/s would be cooler ...

Thank you,

Nico
#2
Sounds like overheating, or one of the GPUs is faulty.
#3
It sounds like one of your GPUs might be having issues, probably due to overheating when you gain that extra 80 - 90 GH/s by using -O.
Are you using risers in that rig? Those can cause issues.
Also, what cooling solution are you using? If the rig is open-air cooled, there should be enough space between the cards. If you're using blower-style cards, make sure that the external fans create enough airflow through the system.
#4
Thank you, mates!

Yes, I'm trying to work out which one is "\Device\Video39".
As far as "HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP\VIDEO" tells me, it must be one of the 2080s. But I haven't found a way to identify them in Device Manager or WMI yet.

Yes, PCIe risers are used, and the rig is open-air cooled in a room with A/C.
#5
When you run with -O and the problem shows up, you could try running nvidia-smi in another console to see which GPU is underperforming.
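If watching the raw nvidia-smi table for ten cards gets tedious, the check can be scripted. Below is a minimal sketch, not a definitive tool: it asks nvidia-smi for a few standard --query-gpu fields in CSV form and flags cards that look hot or underused. The function names and the thresholds (85 degrees C, 90% utilization) are my own assumptions for illustration, so adjust them to your cards.

```python
import csv
import io
import subprocess

# Standard nvidia-smi --query-gpu properties requested from the driver.
FIELDS = ["index", "name", "pci.bus_id", "temperature.gpu",
          "power.draw", "clocks.sm", "fan.speed", "utilization.gpu"]
NUMERIC = {"temperature.gpu", "power.draw", "clocks.sm",
           "fan.speed", "utilization.gpu"}


def to_number(value):
    """Convert a CSV cell to float; values such as [N/A] become None."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_smi_csv(text):
    """Parse 'csv,noheader,nounits' output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != len(FIELDS):
            continue  # skip blank or malformed lines
        row = dict(zip(FIELDS, (cell.strip() for cell in record)))
        for key in NUMERIC:
            row[key] = to_number(row[key])
        rows.append(row)
    return rows


def find_suspects(rows, temp_limit=85.0, min_util=90.0):
    """Return (index, pci bus id, reasons) for GPUs that look unhealthy.

    The thresholds are illustrative assumptions, not vendor limits.
    """
    suspects = []
    for row in rows:
        reasons = []
        temp = row["temperature.gpu"]
        util = row["utilization.gpu"]
        if temp is not None and temp >= temp_limit:
            reasons.append("hot")
        if util is not None and util < min_util:
            reasons.append("low utilization")
        if reasons:
            suspects.append((row["index"], row["pci.bus_id"], reasons))
    return suspects


def query_gpus():
    """Run nvidia-smi once and return the parsed rows."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(FIELDS),
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True)
    return parse_smi_csv(result.stdout)


if __name__ == "__main__":
    for index, bus_id, reasons in find_suspects(query_gpus()):
        print(f"GPU {index} at {bus_id}: {', '.join(reasons)}")
```

Run it in a second console while hashcat is working with -O. The PCI bus ID in the output should help match a flagged card to a physical slot or riser.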
#6
Good to know, I will have a look at nvidia-smi.
I did a clean reinstall of the drivers as described in the FAQ.
Also, one of the RTX 2080 Ti cards had a dead fan! The first of its three fans (the one near the video outputs) was stiff and didn't rotate anymore. I swapped the card for a spare 2080. They are relatively "cheap" Zotac cards.

A single 2080 Ti accounts for as much as 40 GH/s. Kinda impressive! I will report back if the failure persists.

Thank you very much!

Nico