Hashcat CPU vs GPU - Linux vs Windows
#1
Hello everyone. I am sorry, my English is poor, so I am relying on Google Translate.
I have an AMD Threadripper 1950X + GTX 1080 Ti + 64 GB RAM
System 1: Windows 10 64-bit on an NVMe Samsung SSD
System 2: The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali) Linux 64-bit on a SATA drive

=======================================================================

Questions:

   1. How can I compile hashcat to use my hardware (RAM, CPU) more efficiently?
- The NVIDIA GPU performs better on Windows.
- The AMD CPU gets better results under Linux (after installing the Intel OpenCL drivers!).
- Are these differences caused by the operating system, the architecture, or the drivers?
 
  2. How can I make the AMD Threadripper CPU help the GTX 1080 Ti GPU with password cracking?
- With support from PP64, KWP64, or others?
  3. The Intel OpenCL (CPU) driver works better than the original AMD driver under Windows. Is that magic or a bad joke?
  4. How does hashcat access memory: NUMA (local) or UMA (distributed)?
  5. Does hashcat spread work across the CPU cores itself, or does it leave that to the operating system?

I have narrowed the tests down to hash mode 2500 (WPA-EAPOL-PBKDF2) to keep my post readable. I can paste everything if needed.
 
Thank you for your help
  
=========================== TEST =============================

The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali) LINUX OpenCL Pocl Project (default OpenCL)
OpenCL Platform #1: The pocl project (hashcat -b -m 2500 -O)
====================================
* Device #1: pthread-AMD Ryzen Threadripper 1950X 16-Core Processor, 16384/62315 MB allocatable, 32MCU
Benchmark relevant options:
===========================
* --optimized-kernel-enable
Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)
Speed.#1.........:    10639 H/s (47.92ms) @ Accel:512 Loops:128 Thr:1 Vec:8

=======================================================================
=======================================================================

 The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali) LINUX OpenCL INTEL  ( Hashcat -b -w 4 -O)
OpenCL Platform #1: Intel(R) Corporation
========================================
* Device #1: AMD Ryzen Threadripper 1950X 16-Core Processor, 16090/64363 MB allocatable, 32MCU
Benchmark relevant options:
===========================
* --opencl-device-types=1
* --optimized-kernel-enable
* --workload-profile=4

Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4095)
Speed.#1.........:    27400 H/s (298.06ms) @ Accel:1024 Loops:1024 Thr:1 Vec:8

=======================================================================
=======================================================================

Windows10 - PowerShell - ADMIN
PS C:\Hashcat> .\hashcat64 -b -w 4 -D 1 -O
hashcat (v5.1.0) starting in benchmark mode...
OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: GeForce GTX 1080 Ti, skipped.
OpenCL Platform #2: Advanced Micro Devices, Inc.
================================================
* Device #2: AMD Ryzen Threadripper 1950X 16-Core Processor, 16358/65432 MB allocatable, 32MCU
Benchmark relevant options:
===========================
* --opencl-device-types=1
* --optimized-kernel-enable
* --workload-profile=4
Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)
Speed.#2.........:    25218 H/s (323.83ms) @ Accel:1024 Loops:1024 Thr:1 Vec:4

=======================================================================
=======================================================================

GPU + CPU (Windows)
PS C:\Hashcat> .\hashcat64 -b -w 4 -D 1,2 -O
hashcat (v5.1.0) starting in benchmark mode...
OpenCL Platform #1: NVIDIA Corporation
======================================
* Device #1: GeForce GTX 1080 Ti, 2816/11264 MB allocatable, 28MCU
OpenCL Platform #2: Advanced Micro Devices, Inc.
================================================
* Device #2: AMD Ryzen Threadripper 1950X 16-Core Processor, 16358/65432 MB allocatable, 32MCU
Benchmark relevant options:
===========================
* --opencl-device-types=1,2
* --optimized-kernel-enable
* --workload-profile=4

Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4096)
Speed.#1.........:   637.9 kH/s (352.74ms) @ Accel:128 Loops:256 Thr:1024 Vec:1
Speed.#2.........:    22154 H/s (368.57ms) @ Accel:1024 Loops:1024 Thr:1 Vec:4
Speed.#*.........:   660.1 kH/s

=======================================================================
=======================================================================

GPU + CPU  (Linux + Intel OpenCL)
root@jacek:~# sudo hashcat -b -w 4 -D 1,2 -O
hashcat (v5.1.0-811-g5cddf527) starting in benchmark mode...
OpenCL Platform #1: Intel(R) Corporation
========================================
* Device #1: AMD Ryzen Threadripper 1950X 16-Core Processor, 16090/64363 MB allocatable, 32MCU
OpenCL Platform #2: NVIDIA Corporation
======================================
* Device #2: GeForce GTX 1080 Ti, 2793/11175 MB allocatable, 28MCU
Benchmark relevant options:
===========================
* --opencl-device-types=1,2
* --optimized-kernel-enable
* --workload-profile=4
Hashmode: 2500 - WPA-EAPOL-PBKDF2 (Iterations: 4095)
Speed.#1.........:    27142 H/s (301.02ms) @ Accel:1024 Loops:1024 Thr:1 Vec:8
Speed.#2.........:   579.6 kH/s (393.31ms) @ Accel:1024 Loops:1024 Thr:32 Vec:1
Speed.#*.........:   606.8 kH/s

=======================================================================
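
To compare the CPU runs above side by side, the Speed lines can be parsed with a short script. This is only a minimal sketch: it assumes the benchmark line format shown in this post, and `parse_speed` is a hypothetical helper name, not part of hashcat.

```python
import re

# Multipliers for the unit suffixes that can appear in hashcat Speed lines.
UNITS = {"H/s": 1, "kH/s": 1e3, "MH/s": 1e6, "GH/s": 1e9}

# Matches e.g. "Speed.#1.........:    10639 H/s (47.92ms) @ ..."
SPEED_RE = re.compile(r"Speed\.#(\d+|\*)\.*:\s+([\d.]+)\s+([kMG]?H/s)")


def parse_speed(line):
    """Return (device_id, hashes_per_second) for a Speed line, else None."""
    match = SPEED_RE.search(line)
    if match is None:
        return None
    device, value, unit = match.groups()
    return device, float(value) * UNITS[unit]


# CPU-only results for hash mode 2500, copied from the benchmarks above.
runs = {
    "pocl (Linux)": "Speed.#1.........:    10639 H/s (47.92ms) @ Accel:512 Loops:128 Thr:1 Vec:8",
    "Intel OpenCL (Linux)": "Speed.#1.........:    27400 H/s (298.06ms) @ Accel:1024 Loops:1024 Thr:1 Vec:8",
    "AMD OpenCL (Windows)": "Speed.#2.........:    25218 H/s (323.83ms) @ Accel:1024 Loops:1024 Thr:1 Vec:4",
}

baseline = parse_speed(runs["pocl (Linux)"])[1]
for name, line in runs.items():
    _, speed = parse_speed(line)
    print(f"{name}: {speed:.0f} H/s ({speed / baseline:.2f}x pocl)")
```

With the figures from this post, the Intel runtime comes out at roughly 2.6 times the pocl result and the AMD runtime on Windows at roughly 2.4 times. Note that the pocl run used the default workload profile while the other two used -w 4, so the comparison is not strictly like for like.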
#2
1. Your GPU and CPU performance looks good; there is nothing to gain by compiling from source. If you want to compile from source on Windows, read BUILD_msys.md or BUILD_cygwin.md, which you can find in hashcat's GitHub repository. I'm not sure what the problem is; your screenshots look good.
2. Use -D 1,2, but you already did that in the screenshot, so you already know?
3. The Intel OpenCL runtime is better than AMD/POCL because it supports mapping OpenCL vector datatypes to the appropriate CPU instructions (SSE2, AVX2, etc.).
4. Hashcat allocates memory using system alloc(), nothing fancy.
5. Yes, the Intel OpenCL runtime maps OpenCL work-items to multiple cores.
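
On point 2, the two "GPU + CPU" benchmarks in the first post already show how much the CPU adds next to the GPU. A rough calculation, using the speeds exactly as hashcat printed them there (so the totals are only accurate to within rounding); `cpu_share` is just an illustrative helper name:

```python
# Speeds in H/s, copied from the two "GPU + CPU" benchmarks in post #1.
combined_runs = {
    "Windows (AMD OpenCL CPU runtime)": {"gpu": 637_900, "cpu": 22_154},
    "Linux (Intel OpenCL CPU runtime)": {"gpu": 579_600, "cpu": 27_142},
}


def cpu_share(gpu, cpu):
    """Fraction of the combined hash rate that comes from the CPU."""
    return cpu / (gpu + cpu)


for name, speeds in combined_runs.items():
    total = speeds["gpu"] + speeds["cpu"]
    share = cpu_share(**speeds)
    print(f"{name}: total {total / 1000:.1f} kH/s, CPU share {share:.1%}")
```

In both runs the CPU contributes only a few percent of the combined rate for hash mode 2500, so the GPU dominates whichever CPU runtime is installed.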
#3
Thank you very much for taking the time to answer; you helped me understand many unknowns.