Low hashrate with GTX 1650 when using CUDA
Hi!

So I finally did it: I robbed a bank and purchased a brand-new GPU for use with Hashcat. It's not a monster RTX model, but a modest GTX 1650. It's good enough for me, and it's much better than the GTX 560 Ti I was struggling with for the last two days.

You would think that buying a new GPU with up-to-date drivers would make your problems go away. Not in my case, it didn't. I was still unable to run Hashcat 6.2.5.

Code:
hashcat (v6.2.5) starting

Successfully initialized NVIDIA CUDA library.

Failed to initialize NVIDIA RTC library.

* Device #1: CUDA SDK Toolkit not installed or incorrectly installed.
            CUDA SDK Toolkit required for proper device support and utilization.
            Falling back to OpenCL runtime.

* Device #1: WARNING! Kernel exec timeout is not disabled.
            This may cause "CL_OUT_OF_RESOURCES" or related errors.
            To disable the timeout, see: https://hashcat.net/q/timeoutpatch
OpenCL API (OpenCL 3.0 CUDA 11.5.121) - Platform #1 [NVIDIA Corporation]
========================================================================
* Device #1: NVIDIA GeForce GTX 1650, 3520/4095 MB (1023 MB allocatable), 14MCU

OpenCL API (OpenCL 2.0 AMD-APP (1800.11)) - Platform #2 [Advanced Micro Devices, Inc.]
======================================================================================
* Device #2: , skipped

Minimum password length supported by kernel: 0
Maximum password length supported by kernel: 256

Hashes: 7 digests; 7 unique digests, 1 unique salts
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates

Optimizers applied:
* Zero-Byte
* Early-Skip
* Not-Salted
* Not-Iterated
* Single-Salt
* Brute-Force
* Raw-Hash

ATTENTION! Pure (unoptimized) backend kernels selected.
Pure kernels can crack longer passwords, but drastically reduce performance.
If you want to switch to optimized kernels, append -O to your commandline.
See the above message to find out about the exact limits.

Watchdog: Temperature abort trigger set to 90c

Initializing backend runtime for device #1. Please be patient...

I think I was patient enough. I waited 10 minutes before I aborted.

Before trying to run 6.2.5, I was running 3.0 with flying colors. I was getting up to 6500 MH/s, relying only on OpenCL, no CUDA runtime. This is a huge improvement for my limited resources. An MD5 job that took 1 hour, 23 minutes and 29 seconds on the GTX 560 Ti was now taking only 12 minutes and 22 seconds on the GTX 1650. Compare that with 31 minutes and 16 seconds on the Radeon HD 6870, and 3 hours, 39 minutes and 15 seconds on the Intel UHD 630. As you can see, this is a big win for me.
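
Just to put that in perspective, the rough arithmetic from those timings looks like this:

Code:
GTX 560 Ti : 1*3600 + 23*60 + 29 = 5009 s
GTX 1650   : 12*60 + 22          =  742 s
Speedup    : 5009 / 742          ≈ 6.75x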

I decided to start with 3.0 because I was having great success with that version on my old GPUs. Then I moved on to version 4.0 and my hashrate declined quite significantly; I was getting 470 MH/s at most. With version 5.0 I saw a good improvement, maxing out at 1835 MH/s.

But version 6.2.5 was still beyond my reach. It doesn't work with the GTX 560 Ti at all, and it almost works with the GTX 1650. For some reason it was failing to load the NVIDIA RTC library, and 6.2.5 requires CUDA 11. So I downloaded and installed CUDA 11, but on top of CUDA 8. These can be installed side by side, right? I restarted the command shell, restarted Hashcat 6.2.5, and it worked. But with one noticeable difference: it was running at half the speed I was getting with OpenCL. The job is currently running, hashing at about 3700 MH/s.
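
In case it helps to see which toolkit is actually active, I believe something like this shows what the shell picks up (assuming the Windows installer set CUDA_PATH and put the CUDA bin folder on PATH, which I think it does by default):

Code:
REM show the toolkit compiler version currently on PATH
nvcc --version
REM show which install CUDA_PATH points to
echo %CUDA_PATH%
REM locate the NVRTC DLL that Hashcat tries to load (exact name varies by toolkit version)
where nvrtc64*.dll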

Maybe I am missing something, but isn't CUDA mode supposed to be faster than OpenCL on NVIDIA GPUs? That's the whole point of installing it, I presume.

Again, this is an MD5 job and it should plow through it in mere minutes. I am using brute-force mode with a mask. I am also specifying an output file for convenience. That's it, no other options are enabled, just -a, -m, -o, the hash file, and a mask.
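
For reference, the command looks roughly like this (the hash file, output file, and mask shown here are placeholders, not my real ones):

Code:
hashcat -a 3 -m 0 -o cracked.txt hashes.txt ?a?a?a?a?a?a?a?a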

Code:
CUDA API (CUDA 11.5)
====================
* Device #1: NVIDIA GeForce GTX 1650, 3327/4095 MB, 14MCU

OpenCL API (OpenCL 3.0 CUDA 11.5.121) - Platform #1 [NVIDIA Corporation]
========================================================================
* Device #2: NVIDIA GeForce GTX 1650, skipped

OpenCL API (OpenCL 2.0 AMD-APP (1800.11)) - Platform #2 [Advanced Micro Devices, Inc.]
======================================================================================
* Device #3: , skipped

Can someone explain the significance of these three info segments? Why is CUDA listed twice? How is the "OpenCL 3.0 CUDA 11.5.121" entry different from the "CUDA 11.5" entry? And why is the AMD CPU being skipped? Is it because it lacks an iGPU? Can it still be put to work?

If the OpenCL API can be used to utilize CUDA, why do I have to install the CUDA SDK toolkit? If OpenCL is the fallback runtime, like the warning message says, why does it not do what it says and fall back to the OpenCL runtime rather than stalling while waiting for the CUDA runtime to load? Logically, if the CUDA runtime is not installed, or the wrong version is installed, the program should respond in some way or time out rather than just sitting there waiting for better times.

Please forgive me for the many questions. I don't mean to bother you; you probably have better things to do than answer my silly noob questions. But I would greatly appreciate it if you found time to write me a line or two, at least to say hello.

Playfully yours,
meow
🐈