Benchmarking with 24GB RAM usage
#1
Hello.

I'm using the hashcat benchmark to test my 8x GPU nodes, but the --benchmark option only uses 3-5GB of RAM at most.

I have 24GB or 48GB of RAM on each GPU and I want to use all the available RAM.

How can I do this?

Thanks
Reply
#2
Using more RAM wouldn't net any better speed. Hashcat takes what it needs. If you have a much larger hashlist or ruleset, it will naturally use more RAM, but just overprovisioning Hashcat with more RAM than it needs wouldn't give any benefit
Reply
#3
(6 hours ago)penguinkeeper Wrote: Using more RAM wouldn't net any better speed. Hashcat takes what it needs. If you have a much larger hashlist or ruleset, it will naturally use more RAM, but just overprovisioning Hashcat with more RAM than it needs wouldn't give any benefit

Hello Penguinkeeper. 

My goal is not the case you are thinking of.
I'm using hashcat to stress test my AI cluster and I want to exercise all the available RAM on the GPUs.
Currently I use this command line to run the test in an endless loop; basically it loads and removes data every 6 seconds.
"hashcat --benchmark -D 2 -m 1700 -n 4 -u 1024 -t 512 --quiet --force 2>/dev/null | grep -o "Speed.**"

With this method I just make sure the server is stable enough to run AI workloads.

Now I want to use hashcat to do a basic GPU memory test by loading data into all the available memory.
Reply
#4
With this command line I can reach 190W and Memory-Usage: 3036/24576MiB.
I need to force 250W and also the full 24576MiB usage to be sure the system runs stably.

hashcat -m 1700 -a 3 -w 4 -u 1024 --runtime=600 --force --potfile-disable "cf83e1357eefb8bdf1542850d66d8007d620e4050b87f8f1d9a2a56f1a37e4d9f1532e6bb5c0f58ef2fc118f2b0d6c7c19c3c9e9e7b9c0304a6c7cc801f8ee92" "?a?a?a?a?a?a?a?a"

0: Tesla P40| 46C |GPU-Util: 100%|Pwr:181W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
1: Tesla P40| 58C |GPU-Util: 100%|Pwr:188W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
2: Tesla P40| 57C |GPU-Util: 100%|Pwr:185W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
3: Tesla P40| 53C |GPU-Util: 100%|Pwr:183W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
4: Tesla P40| 50C |GPU-Util: 100%|Pwr:191W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
5: Tesla P40| 49C |GPU-Util: 100%|Pwr:192W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
6: Tesla P40| 46C |GPU-Util: 100%|Pwr:187W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
7: Tesla P40| 53C |GPU-Util: 100%|Pwr:192W/250W|Core: 1531MHz|Mclk: 3615MHz|Memory-Usage:3036/24576MiB|Thermal Slowdown: Not Active
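The status lines above are the kind of data nvidia-smi can report directly; a small polling sketch (assuming nvidia-smi is available, with field names taken from its --query-gpu documentation):

#!/usr/bin/env python3
# Sketch: poll nvidia-smi once per second for the per-GPU fields shown above
# (temperature, utilization, power, clocks, memory usage).
import subprocess
import time

QUERY = ("index,name,temperature.gpu,utilization.gpu,power.draw,power.limit,"
         "clocks.sm,clocks.mem,memory.used,memory.total")

while True:
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader"],
        capture_output=True, text=True).stdout
    print(out.strip())
    time.sleep(1)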
Reply
#5
Although very stressful for the hardware, hashcat isn't a stress test program; you can't tell hashcat how much RAM it should use. The only way I can think of is providing a wordlist of ~200GB so hashcat can distribute it across all cards, but I'm not quite sure whether this will truly work that way.
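For what it's worth, a throwaway wordlist of roughly that size could be generated with something like the sketch below (the target size, filename and candidate format are arbitrary choices, and as said above I'm not sure hashcat will actually spread it across the cards' memory):

#!/usr/bin/env python3
# Sketch: write numbered candidates to disk until a target size is reached.
# The ~200GB target, the filename and the candidate format are arbitrary.
TARGET_BYTES = 200 * 1024**3

written = 0
i = 0
with open("big_wordlist.txt", "w") as f:
    while written < TARGET_BYTES:
        written += f.write(f"stress{i:016d}\n")
        i += 1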
Reply
#6
(5 hours ago)Snoopy Wrote: Although very stressful for the hardware, hashcat isn't a stress test program; you can't tell hashcat how much RAM it should use. The only way I can think of is providing a wordlist of ~200GB so hashcat can distribute it across all cards, but I'm not quite sure whether this will truly work that way.

I know hashcat is not a stress tool, but it generates stress very easily and I like it.
I'm looking for a way to use all the RAM without providing a 200GB list.
It can be duplicated data, no problem for me. The only requirement is that all the available RAM gets used, so if there is a bad sector then it will crash.
Reply
#7
Why not run an AI-specific stress test or loop a heavy AI workload? It can even be something relatively simple like running Llama 70B in a loop. Hashcat stresses the GPU core, not so much the VRAM, so it's just the wrong tool for this.
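As a rough illustration, a minimal VRAM fill-and-verify sketch (assuming PyTorch with CUDA is installed on the cluster; the ~1GiB headroom and the 0xA5 pattern are arbitrary choices) could look like this:

#!/usr/bin/env python3
# Rough sketch: fill almost all VRAM on every GPU with a known pattern,
# then read it back and count mismatches. Assumes PyTorch with CUDA;
# the ~1GiB headroom and the 0xA5 pattern are arbitrary choices.
import torch

CHUNK = 256 * 1024 * 1024          # 256MiB per allocation
HEADROOM = 1024 * 1024 * 1024      # leave ~1GiB free per GPU

for dev in range(torch.cuda.device_count()):
    total = torch.cuda.get_device_properties(dev).total_memory
    buffers, allocated = [], 0
    while allocated + CHUNK <= total - HEADROOM:
        buffers.append(torch.full((CHUNK,), 0xA5, dtype=torch.uint8,
                                  device=f"cuda:{dev}"))
        allocated += CHUNK
    torch.cuda.synchronize(dev)
    # Any mismatch after the read-back hints at bad memory cells.
    bad = sum(int((buf != 0xA5).sum().item()) for buf in buffers)
    print(f"GPU {dev}: filled {allocated // 2**20} MiB, mismatched bytes: {bad}")
    del buffers
    torch.cuda.empty_cache()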
Reply
#8
(3 hours ago)penguinkeeper Wrote: Why not run an AI-specific stress test or loop a heavy AI workload? It can even be something relatively simple like running Llama 70B in a loop. Hashcat stresses the GPU core, not so much the VRAM, so it's just the wrong tool for this.

You are right. I initially developed a script to test core speed, and its output is very simple to compare with other nodes, but I wanted to add a RAM test option by simply increasing the RAM consumption.

Speed.#1.........:  1153.9 MH/s (54.16ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#2.........:  1149.8 MH/s (54.36ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#3.........:  1153.9 MH/s (54.16ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#4.........:  1153.9 MH/s (54.16ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#5.........:  1150.5 MH/s (54.34ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#6.........:  1154.0 MH/s (54.19ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#7.........:  1155.0 MH/s (54.12ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#8.........:  1151.9 MH/s (54.27ms) @ Accel:4 Loops:1024 Thr:512 Vec:1
Speed.#*.........:  9222.8 MH/s

With AI tools, testing each GPU one by one and then all together is not as simple as it is with hashcat.
If you have any recommendations, of course that would be awesome.
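For context, the per-GPU / all-GPU pattern with hashcat can be something along these lines (a simplified sketch using hashcat's -d device selection; the device count of 8 is an assumption):

#!/usr/bin/env python3
# Sketch: run the same benchmark on each GPU alone and then on all GPUs,
# so the Speed lines can be compared across nodes. Uses hashcat's -d
# device-selection option; the device count of 8 is an assumption.
import subprocess

BASE = ["hashcat", "--benchmark", "-D", "2", "-m", "1700",
        "-n", "4", "-u", "1024", "-t", "512", "--quiet", "--force"]
NUM_GPUS = 8

def speeds(extra):
    out = subprocess.run(BASE + extra, capture_output=True, text=True).stdout
    return [line for line in out.splitlines() if line.startswith("Speed")]

for gpu in range(1, NUM_GPUS + 1):      # hashcat device ids start at 1
    print(f"--- GPU {gpu} alone ---")
    print("\n".join(speeds(["-d", str(gpu)])))

print("--- all GPUs together ---")
print("\n".join(speeds([])))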
Reply