I think the 1060 is definitely your best bet. Here is a benchmark of my GTX 1060 6GB (the driver is not currently up to date, but it should give you a good idea):
Note that the 6GB variant also has slightly more streaming processors than the 3GB, so the 3GB's performance will be a bit lower than the numbers shown here.
This might also be interesting for you: https://hashcat.net/forum/thread-7765.html
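To put a rough number on the 3GB vs. 6GB difference: the 6GB card has 1280 CUDA cores (10 SMs, which matches the "10MCU" in the benchmark) and the 3GB card has 1152 (9 SMs). The sketch below assumes hash rate scales linearly with core count at equal clocks. That is only an approximation, and the helper name `estimate_3gb` is made up for illustration. The input speeds are taken from the benchmark in this post.

```python
# Rough estimate only: assumes hash rate scales linearly with CUDA core
# count at equal clocks. Real cards will deviate from this.
CORES_6GB = 1280  # GTX 1060 6GB (10 SMs)
CORES_3GB = 1152  # GTX 1060 3GB (9 SMs)

# Measured 6GB speeds from the benchmark in this post, in MH/s.
measured_6gb_mhs = {"MD5": 9655.7, "SHA1": 3688.0, "NTLM": 16901.8}


def estimate_3gb(speed_6gb: float) -> float:
    """Scale a 6GB speed by the core-count ratio (hypothetical helper)."""
    return speed_6gb * CORES_3GB / CORES_6GB


for name, speed in measured_6gb_mhs.items():
    print(f"{name}: {speed:.1f} MH/s (6GB) -> ~{estimate_3gb(speed):.0f} MH/s (3GB, estimated)")
```

Under that assumption the 3GB card would land about 10% below the 6GB figures. A real benchmark of a 3GB card is the only reliable number.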
Code:
hashcat (v5.1.0) starting in benchmark mode...
Benchmarking uses hand-optimized kernel code by default.
You can use it in your cracking session by setting the -O option.
Note: Using optimized kernel code limits the maximum supported password length.
To disable the optimized kernel code in benchmark mode, use the -w option.
* Device #1: WARNING! Kernel exec timeout is not disabled.
This may cause "CL_OUT_OF_RESOURCES" or related errors.
To disable the timeout, see: https://hashcat.net/q/timeoutpatch
OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: GeForce GTX 1060 6GB, 1536/6144 MB allocatable, 10MCU
Benchmark relevant options:
===========================
* --optimized-kernel-enable
Hashmode: 0 - MD5
Speed.#1.........: 9655.7 MH/s (68.75ms) @ Accel:1024 Loops:256 Thr:256 Vec:1
Hashmode: 100 - SHA1
Speed.#1.........: 3688.0 MH/s (90.32ms) @ Accel:512 Loops:256 Thr:256 Vec:1
Hashmode: 1400 - SHA2-256
Speed.#1.........: 1351.2 MH/s (61.51ms) @ Accel:256 Loops:128 Thr:256 Vec:1
Hashmode: 1700 - SHA2-512
Speed.#1.........: 470.9 MH/s (88.59ms) @ Accel:256 Loops:64 Thr:256 Vec:1
Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)
Speed.#1.........: 175.6 kH/s (57.96ms) @ Accel:256 Loops:64 Thr:256 Vec:1
Hashmode: 1000 - NTLM
Speed.#1.........: 16901.8 MH/s (78.58ms) @ Accel:1024 Loops:512 Thr:256 Vec:1
Hashmode: 3000 - LM
Speed.#1.........: 9599.9 MH/s (69.40ms) @ Accel:256 Loops:1024 Thr:256 Vec:1
Hashmode: 5500 - NetNTLMv1 / NetNTLMv1+ESS
Speed.#1.........: 9692.7 MH/s (68.38ms) @ Accel:1024 Loops:256 Thr:256 Vec:1
Hashmode: 5600 - NetNTLMv2
Speed.#1.........: 753.1 MH/s (55.29ms) @ Accel:256 Loops:64 Thr:256 Vec:1
Hashmode: 1500 - descrypt, DES (Unix), Traditional DES
Speed.#1.........: 413.1 MH/s (50.31ms) @ Accel:8 Loops:1024 Thr:256 Vec:1
Hashmode: 500 - md5crypt, MD5 (Unix), Cisco-IOS $1$ (MD5) (Iterations: 1000)
Speed.#1.........: 4362.7 kH/s (69.37ms) @ Accel:1024 Loops:1000 Thr:32 Vec:1
Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)
Speed.#1.........: 7800 H/s (39.82ms) @ Accel:16 Loops:8 Thr:8 Vec:1
Hashmode: 1800 - sha512crypt $6$, SHA512 (Unix) (Iterations: 5000)
Speed.#1.........: 64347 H/s (62.78ms) @ Accel:512 Loops:128 Thr:32 Vec:1
Hashmode: 7500 - Kerberos 5 AS-REQ Pre-Auth etype 23
Speed.#1.........: 149.8 MH/s (69.63ms) @ Accel:256 Loops:64 Thr:64 Vec:1
Hashmode: 13100 - Kerberos 5 TGS-REP etype 23
Speed.#1.........: 148.7 MH/s (70.14ms) @ Accel:256 Loops:64 Thr:64 Vec:1
Hashmode: 15300 - DPAPI masterkey file v1 (Iterations: 23999)
Speed.#1.........: 30273 H/s (57.32ms) @ Accel:256 Loops:64 Thr:256 Vec:1
Hashmode: 15900 - DPAPI masterkey file v2 (Iterations: 7999)
Speed.#1.........: 21634 H/s (59.68ms) @ Accel:256 Loops:128 Thr:32 Vec:1
Hashmode: 7100 - macOS v10.8+ (PBKDF2-SHA512) (Iterations: 35000)
Speed.#1.........: 5368 H/s (55.40ms) @ Accel:128 Loops:32 Thr:256 Vec:1
Hashmode: 11600 - 7-Zip (Iterations: 524288)
Speed.#1.........: 4257 H/s (74.59ms) @ Accel:512 Loops:128 Thr:256 Vec:1
Hashmode: 12500 - RAR3-hp (Iterations: 262144)
Speed.#1.........: 19661 H/s (64.79ms) @ Accel:8 Loops:16384 Thr:256 Vec:1
Hashmode: 13000 - RAR5 (Iterations: 32767)
Speed.#1.........: 17069 H/s (74.45ms) @ Accel:256 Loops:64 Thr:256 Vec:1
Hashmode: 6211 - TrueCrypt PBKDF2-HMAC-RIPEMD160 + XTS 512 bit (Iterations: 2000)
Speed.#1.........: 114.9 kH/s (77.80ms) @ Accel:128 Loops:64 Thr:256 Vec:1
Hashmode: 13400 - KeePass 1 (AES/Twofish) and KeePass 2 (AES) (Iterations: 6000)
Speed.#1.........: 64510 H/s (105.32ms) @ Accel:512 Loops:256 Thr:32 Vec:1
Hashmode: 6800 - LastPass + LastPass sniffed (Iterations: 500)
Speed.#1.........: 1054.6 kH/s (73.71ms) @ Accel:128 Loops:125 Thr:256 Vec:1
Hashmode: 11300 - Bitcoin/Litecoin wallet.dat (Iterations: 199999)
Speed.#1.........: 1996 H/s (52.22ms) @ Accel:128 Loops:64 Thr:256 Vec:1
Started: Mon Mar 18 19:42:54 2019
Stopped: Mon Mar 18 19:48:43 2019
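If you want to compare these numbers against another card, it helps to convert everything to plain H/s first, since the output mixes H/s, kH/s and MH/s. Here is a minimal sketch that parses benchmark text in the format shown above. The function name `parse_benchmark` and the regexes are my own and are only checked against this output format. It assumes a single device, so with several devices later lines would overwrite earlier ones.

```python
import re

# Multipliers for the unit prefixes printed in front of "H/s".
UNIT = {"": 1, "k": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}

MODE_RE = re.compile(r"^Hashmode:\s+(\d+)\s+-\s+(.*)$")
SPEED_RE = re.compile(r"^Speed\.#(\d+)\.*:\s+([\d.]+)\s+([kMGT]?)H/s")


def parse_benchmark(text):
    """Return {hash mode number: (mode name, speed in H/s)}.

    Assumes a single device; each Speed line is attributed to the most
    recent Hashmode line.
    """
    results = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        mode = MODE_RE.match(line)
        if mode:
            current = (int(mode.group(1)), mode.group(2))
            continue
        speed = SPEED_RE.match(line)
        if speed and current is not None:
            value = float(speed.group(2)) * UNIT[speed.group(3)]
            results[current[0]] = (current[1], value)
            current = None
    return results


# Two entries copied from the benchmark above.
sample = """Hashmode: 0 - MD5
Speed.#1.........: 9655.7 MH/s (68.75ms) @ Accel:1024 Loops:256 Thr:256 Vec:1
Hashmode: 3200 - bcrypt $2*$, Blowfish (Unix) (Iterations: 32)
Speed.#1.........: 7800 H/s (39.82ms) @ Accel:16 Loops:8 Thr:8 Vec:1
"""

for mode, (name, hps) in sorted(parse_benchmark(sample).items()):
    print(f"{mode:>5}  {hps:>16,.0f} H/s  {name}")
```

Running the same function over two saved benchmark logs gives two dictionaries keyed by hash mode, which makes a per-mode comparison straightforward.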