GTX 1660ti, CUDA, slow hash rate for some algos
#1
I haven't found any benchmarks for the GTX 1660 Ti, but a forum post says it should be comparable to a GTX 1070. When I benchmark WPA, bcrypt and LUKS, I get results similar to the 1070 benchmarks posted online. When I crack actual hashes, WPA runs close to its benchmark speed, but bcrypt and LUKS just crawl along. For example, bcrypt benchmarks at roughly 12000 H/s, but my hash runs at 115 H/s.

I'm running the latest hashcat from GitHub on Ubuntu 18.04 LTS.
NVIDIA Driver Version: 435.21

Hopefully someone can give me a tip. Thanks!


Code:
user@linux:~$ hashcat -I
hashcat (v5.1.0-1447-gc4dd0206) starting...

CUDA Info:
==========

CUDA.Version.: 10.1

Backend Device ID #1 (Alias: #3)
  Name...........: GeForce GTX 1660 Ti
  Processor(s)...: 24
  Clock..........: 1455
  Memory.Total...: 5944 MB
  Memory.Free....: 5391 MB

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: Intel(R) Corporation
  Name....: Intel(R) CPU Runtime for OpenCL(TM) Applications
  Version.: OpenCL 2.1 LINUX

  Backend Device ID #2
    Type...........: CPU
    Vendor.ID......: 8
    Vendor.........: Intel(R) Corporation
    Name...........: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
    Version........: OpenCL 2.1 (Build 0)
    Processor(s)...: 12
    Clock..........: 2600
    Memory.Total...: 15879 MB (limited to 3969 MB allocatable in one block)
    Memory.Free....: 15815 MB
    OpenCL.Version.: OpenCL C 2.0
    Driver.Version.: 18.1.0.0920

OpenCL Platform ID #2
  Vendor..: NVIDIA Corporation
  Name....: NVIDIA CUDA
  Version.: OpenCL 1.2 CUDA 10.1.0

  Backend Device ID #3 (Alias: #1)
    Type...........: GPU
    Vendor.ID......: 32
    Vendor.........: NVIDIA Corporation
    Name...........: GeForce GTX 1660 Ti
    Version........: OpenCL 1.2 CUDA
    Processor(s)...: 24
    Clock..........: 1455
    Memory.Total...: 5944 MB (limited to 1486 MB allocatable in one block)
    Memory.Free....: 5376 MB
    OpenCL.Version.: OpenCL C 1.2
    Driver.Version.: 435.21


Code:
user@linux:~$ hashcat -m 3200 -b
hashcat (v5.1.0-1447-gc4dd0206) starting in benchmark mode...

Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.

/usr/local/share/hashcat/OpenCL/m03200-optimized.cl: Optimized kernel requested but not needed - falling back to pure kernel
* Device #1: WARNING! Kernel exec timeout is not disabled.
            This may cause "CL_OUT_OF_RESOURCES" or related errors.
            To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #3: WARNING! Kernel exec timeout is not disabled.
            This may cause "CL_OUT_OF_RESOURCES" or related errors.
            To disable the timeout, see: https://hashcat.net/q/timeoutpatch
nvmlDeviceGetFanSpeed(): Not Supported

CUDA API (CUDA 10.1)
====================
* Device #1: GeForce GTX 1660 Ti, 5391/5944 MB, 24MCU

OpenCL API (OpenCL 2.1 LINUX) - Platform #1 [Intel(R) Corporation]
==================================================================
* Device #2: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz, skipped

OpenCL API (OpenCL 1.2 CUDA 10.1.0) - Platform #2 [NVIDIA Corporation]
======================================================================
* Device #3: GeForce GTX 1660 Ti, skipped

Benchmark relevant options:
===========================
* --optimized-kernel-enable

Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)

Speed.#1.........:    12822 H/s (43.13ms) @ Accel:2 Loops:32 Thr:12 Vec:1

Started: Mon Nov 18 17:36:01 2019
Stopped: Mon Nov 18 17:36:06 2019


Code:
user@linux:~$ hashcat -m 3200 -a 0 -O -w 4 testhash.txt ~/Documents/wordlists/rockyou.txt
hashcat (v5.1.0-1447-gc4dd0206) starting...

/usr/local/share/hashcat/OpenCL/m03200-optimized.cl: Optimized kernel requested but not needed - falling back to pure kernel
* Device #1: WARNING! Kernel exec timeout is not disabled.
            This may cause "CL_OUT_OF_RESOURCES" or related errors.
            To disable the timeout, see: https://hashcat.net/q/timeoutpatch
* Device #3: WARNING! Kernel exec timeout is not disabled.
            This may cause "CL_OUT_OF_RESOURCES" or related errors.
            To disable the timeout, see: https://hashcat.net/q/timeoutpatch
nvmlDeviceGetFanSpeed(): Not Supported

CUDA API (CUDA 10.1)
====================
* Device #1: GeForce GTX 1660 Ti, 5391/5944 MB, 24MCU

OpenCL API (OpenCL 2.1 LINUX) - Platform #1 [Intel(R) Corporation]
==================================================================
* Device #2: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz, skipped

OpenCL API (OpenCL 1.2 CUDA 10.1.0) - Platform #2 [NVIDIA Corporation]
======================================================================
* Device #3: GeForce GTX 1660 Ti, skipped

/usr/local/share/hashcat/OpenCL/m03200-optimized.cl: Optimized kernel requested but not needed - falling back to pure kernel
Hashes: 1 digests; 1 unique digests, 1 unique salts
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes, 5/13 rotates
Rules: 1

Applicable optimizers:
* Zero-Byte
* Single-Hash
* Single-Salt

Minimum password length supported by kernel: 0
Maximum password length supported by kernel: 72

Watchdog: Temperature abort trigger set to 90c

Host memory required for this attack: 143 MB

Dictionary cache hit:
* Filename..: /home/user/Documents/wordlists/rockyou.txt
* Passwords.: 14344389
* Bytes.....: 139921525
* Keyspace..: 14344389

[s]tatus [p]ause [b]ypass [c]heckpoint [q]uit => s

Session..........: hashcat
Status...........: Running
Hash.Name........: bcrypt $2*$, Blowfish (Unix)
Hash.Target......: $2b$12$s6iCykoyVsvksmXofX8gReLIpYpdJYrnh1tmGZGac9Fa...gv5pDq
Time.Started.....: Mon Nov 18 17:38:07 2019 (8 secs)
Time.Estimated...: Wed Nov 20 04:09:52 2019 (1 day, 10 hours)
Guess.Base.......: File (/home/user/Documents/wordlists/rockyou.txt)
Guess.Queue......: 1/1 (100.00%)
Speed.#1.........:      115 H/s (155.79ms) @ Accel:1 Loops:256 Thr:12 Vec:1
Recovered........: 0/1 (0.00%) Digests
Progress.........: 864/14344389 (0.01%)
Rejected.........: 0/864 (0.00%)
Restore.Point....: 864/14344389 (0.01%)
Restore.Sub.#1...: Salt:0 Amplifier:0-1 Iteration:768-1024
Candidates.#1....: lipgloss -> tyler1
Hardware.Mon.#1..: Temp: 59c Util:100% Core:1890MHz Mem:6000MHz Bus:16
#2
I'm guessing it has to do with the iteration count. The benchmark uses a fixed default iteration count (32, as shown in your benchmark output), whereas a real hash carries its own iteration count, which can be much higher.
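To check which iteration count a given hash uses, here is a minimal Python sketch. It is not part of hashcat, and the function names are made up for illustration. It assumes the usual bcrypt layout: a `$2a$`, `$2b$`, `$2x$` or `$2y$` prefix followed by a two-digit cost, where the iteration count is 2 raised to that cost.

```python
import re

# Assumed layout: "$2a$", "$2b$", "$2x$" or "$2y$", then a two-digit cost and "$".
_BCRYPT_PREFIX = re.compile(r"^\$2[abxy]\$(\d{2})\$")


def bcrypt_cost(hash_string: str) -> int:
    """Return the cost field (the log2 of the iteration count)."""
    match = _BCRYPT_PREFIX.match(hash_string)
    if match is None:
        raise ValueError("does not look like a bcrypt hash")
    return int(match.group(1))


def bcrypt_iterations(hash_string: str) -> int:
    """Return the number of key-setup iterations implied by the cost."""
    return 2 ** bcrypt_cost(hash_string)


if __name__ == "__main__":
    example = "$2a$05$LhayLxezLhK1LhWvKxCyLOj0j1u.Kj0jZ0pEmm134uzrQlFvQJLF6"
    print(bcrypt_cost(example), bcrypt_iterations(example))
```

Only the prefix is inspected, so the check also works on a hash that hashcat has shortened in its status display.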

Try the example hash and see whether it produces the same rate as the benchmark. On my machine there was no difference between the benchmark and the example hash using a brute-force attack.

$2a$05$LhayLxezLhK1LhWvKxCyLOj0j1u.Kj0jZ0pEmm134uzrQlFvQJLF6

Password is hashcat

Code:
hashcat64.exe -m 3200 -a 3 $2a$05$LhayLxezLhK1LhWvKxCyLOj0j1u.Kj0jZ0pEmm134uzrQlFvQJLF6 has?l?l?l?l

Edit: Also tested with rockyou.txt and no drop in speed from benchmark.
#3
(11-19-2019, 02:46 AM)slyexe Wrote: I'm guessing it has to do with the Iteration Count. Benchmark has a defaulted iteration count which obviously can vary based on the hash provided.

You're exactly right. My hash has a cost of 12 compared to 5 for yours, which means 2^12 = 4096 iterations instead of 2^5 = 32. I had no idea the iterations parameter was logarithmic.
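For anyone who wants to sanity-check the numbers, here is a rough back-of-the-envelope sketch in Python. It assumes the speed scales inversely with the iteration count, which is a simplification, and the function name is just for illustration.

```python
def expected_speed(benchmark_hs: float, benchmark_iterations: int, cost: int) -> float:
    """Scale a benchmark speed to a hash with the given bcrypt cost.

    Assumes speed is inversely proportional to the iteration count,
    which is 2 ** cost for bcrypt.
    """
    iterations = 2 ** cost
    return benchmark_hs * benchmark_iterations / iterations


if __name__ == "__main__":
    # Benchmark from this thread: 12822 H/s at 32 iterations (cost 5).
    estimate = expected_speed(12822, 32, 12)
    print(round(estimate, 1))  # 100.2
```

The estimate of about 100 H/s is in the same range as the 115 H/s I actually see, so the card is behaving as expected for a cost 12 hash.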

Thank you