Posts: 6
Threads: 1
Joined: Oct 2023
(10-07-2023, 06:28 AM)ManuB1G Wrote: From my point of view the L40 will be the best datacenter GPU for hashcat. It has more shaders (18176) than the 4090, H100 and A100.
The L40S seems to have a higher clock (2520 MHz vs 2490 MHz), but the extra 50 W is a high price for barely more than +1%. Eight of them in a server add 400 W. This is why we decided to use the L40: they are a third of the H100 price, and the fast memory of the H100 is not a huge performance increase for hashcat. As far as I know, the RTX 6000 Ada is close to the L40 but with a fan for workstation usage.
Thank you for sharing your assessment, it aligns with what I was thinking. Please share your performance benchmarks when you get your cards.
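For anyone who wants to check the L40 vs L40S arithmetic in the quoted post, here is a minimal Python sketch. It only uses the figures quoted above (2490 MHz vs 2520 MHz, +50 W per card, 8 cards per server); the function names are made up for illustration and nothing here is a measured result.

```python
# Figures taken from the quoted post; these are paper specs, not measurements.
L40_CLOCK_MHZ = 2490
L40S_CLOCK_MHZ = 2520
EXTRA_WATTS_PER_CARD = 50
CARDS_PER_SERVER = 8


def clock_gain_percent(base_mhz: float, faster_mhz: float) -> float:
    """Relative clock advantage of the faster card, in percent."""
    return (faster_mhz - base_mhz) / base_mhz * 100


def extra_server_watts(extra_per_card: float, cards: int) -> float:
    """Additional power draw for a fully populated server."""
    return extra_per_card * cards


gain = clock_gain_percent(L40_CLOCK_MHZ, L40S_CLOCK_MHZ)
extra = extra_server_watts(EXTRA_WATTS_PER_CARD, CARDS_PER_SERVER)

print(f"L40S clock advantage on paper: {gain:.2f}%")
print(f"Extra power for {CARDS_PER_SERVER} cards: {extra} W")
```

On paper that works out to roughly a 1.2% clock advantage for 400 W of extra power per 8-GPU server, which is the trade-off described above.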
Posts: 412
Threads: 2
Joined: Dec 2015
10-09-2023, 08:12 PM
(This post was last modified: 10-09-2023, 10:51 PM by Chick3nman.)
(10-09-2023, 02:31 PM)starfish Wrote: Thanks Chick3nman - the information from marc1n was getting somewhat contradictory.
My understanding is as follows, please correct where I am misinformed:
- Hashcat will use as many GPUs as you give it
- Hashcat does not benefit from NVLink/NVSwitch
- Hashcat does not benefit from Tensor cores
- Hashcat does not benefit from RT cores
- The number and clock speed of CUDA/FP32 Cores are important to the performance
Regarding GPUs (using NVIDIA's definitions for datacentre and workstation):
- The RTX 4090 is the best performing consumer card you have tested.
- The RTX 6000 Ada is the best performing workstation card and should be slightly faster than the RTX 4090 due to its increased number of cores (18,176 vs 16,384) and very similar clock speed.
- The L40 is the best performing 'datacentre' card and likely to perform similarly to the RTX 6000 Ada.
As you note, other factors include card cost and availability of them in the desired server vendor.
You said you did some testing with the L40, do you have any benchmarks?
Many Thanks
Yeah, this all sounds correct. I would expect the RTX 6000 Ada and/or L40(S) to be around the same or slower than the 4090 though. The clock speeds on paper are NOT the clock speeds during operation. The consumer GPUs will boost significantly higher during usage than the professional cards and thus are usually a few percent faster in many algorithms, even if the professional card has more cores. Other than that, you are correct.
One thing to note on the NVLink/NVSwitch stuff. Not only do we not use it, but you will have to potentially disable or work around it in some cases where it may interfere, so it's best to avoid it in the first place if you only plan to use the cards for hashcat. I know on modern SXM GPUs that are using NVSwitch interlinks you must install the fabric manager and potentially configure some things to allow hashcat to function.
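To put the clock-speed point in numbers, here is a rough Python sketch. It assumes throughput scales with cores × sustained clock, which is a crude proxy and not how hashcat itself estimates speed. The core counts are the ones quoted above (16,384 vs 18,176); the 2700 MHz sustained clock for the 4090 is a purely hypothetical value for illustration, not a measurement.

```python
# Core counts quoted earlier in the thread.
CORES_RTX_4090 = 16384
CORES_RTX_6000_ADA = 18176

# Hypothetical sustained clock under load; an assumption, not a benchmark.
HYPOTHETICAL_4090_MHZ = 2700


def relative_throughput(cores: int, sustained_mhz: float) -> float:
    """Crude proxy: cores * sustained clock. Ignores memory, power limits, etc."""
    return cores * sustained_mhz


def break_even_clock(consumer_mhz: float) -> float:
    """Clock the 18,176-core card must sustain to match the 16,384-core card."""
    return consumer_mhz * CORES_RTX_4090 / CORES_RTX_6000_ADA


ratio = CORES_RTX_4090 / CORES_RTX_6000_ADA
needed = break_even_clock(HYPOTHETICAL_4090_MHZ)

print(f"Break-even clock ratio: {ratio:.3f}")
print(f"If the 4090 holds {HYPOTHETICAL_4090_MHZ} MHz, "
      f"the 18,176-core card needs about {needed:.0f} MHz to keep up")
```

Under this simplified model, the card with more cores only wins if it sustains more than about 90% of the consumer card's real clock under load, so a large enough boost gap can cancel the core-count advantage.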
Posts: 930
Threads: 4
Joined: Jan 2015
12-17-2023, 05:30 AM
(This post was last modified: 12-17-2023, 05:30 AM by royce.)
FWIW, here are some L40 benchmarks from ManuB1G - coming in lower than 4090 but not bad, just as Chick3nman suggested.
https://hashcat.net/forum/thread-11732.html