04-03-2024, 02:59 PM
Hi all, I'm wondering whether anyone has successfully run Hashcat using the HIP API on AMD MI250 GPUs. I can get Hashcat to run over OpenCL, but performance seems a bit slow. With HIP, I get kernel compile errors as soon as the benchmark starts. I'm running RHEL 8.9 with the latest version of amdgpu-install. I've tried both the Hashcat 6.2.6 release and a build from source, and I end up with the same error messages either way. Any thoughts on the OpenCL performance I'm getting? Could I expect better performance if I got HIP working? Thanks in advance!
This is the approach I'm using for the AMD ROCm driver install:
# amdgpu-install --usecase=rocm,hiplibsdk,opencl
Here is the full output of benchmark:
# ./hashcat -b -w4 -m1000
hashcat (v6.2.6-850-gfafb277e0) starting in benchmark mode
The device #9 specifically listed was skipped because it is an alias of device #1
The device #10 specifically listed was skipped because it is an alias of device #2
The device #11 specifically listed was skipped because it is an alias of device #3
The device #12 specifically listed was skipped because it is an alias of device #4
The device #13 specifically listed was skipped because it is an alias of device #5
The device #14 specifically listed was skipped because it is an alias of device #6
The device #15 specifically listed was skipped because it is an alias of device #7
The device #16 specifically listed was skipped because it is an alias of device #8
/sys/bus/pci/devices/0000:31:00.0/hwmon/hwmon3/pwm1: No such file or directory
/sys/bus/pci/devices/0000:34:00.0/hwmon/hwmon4/pwm1: No such file or directory
/sys/bus/pci/devices/0000:11:00.0/hwmon/hwmon5/pwm1: No such file or directory
/sys/bus/pci/devices/0000:14:00.0/hwmon/hwmon6/pwm1: No such file or directory
/sys/bus/pci/devices/0000:ae:00.0/hwmon/hwmon7/pwm1: No such file or directory
/sys/bus/pci/devices/0000:b3:00.0/hwmon/hwmon8/pwm1: No such file or directory
/sys/bus/pci/devices/0000:8e:00.0/hwmon/hwmon9/pwm1: No such file or directory
/sys/bus/pci/devices/0000:93:00.0/hwmon/hwmon10/pwm1: No such file or directory
HIP API (HIP 6.0.32831)
=======================
* Device #1: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #2: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #3: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #4: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #5: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #6: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #7: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #8: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
OpenCL API (OpenCL 2.1 AMD-APP (3602.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #9: AMD Instinct MI250X/MI250, skipped
* Device #10: AMD Instinct MI250X/MI250, skipped
* Device #11: AMD Instinct MI250X/MI250, skipped
* Device #12: AMD Instinct MI250X/MI250, skipped
* Device #13: AMD Instinct MI250X/MI250, skipped
* Device #14: AMD Instinct MI250X/MI250, skipped
* Device #15: AMD Instinct MI250X/MI250, skipped
* Device #16: AMD Instinct MI250X/MI250, skipped
Benchmark relevant options:
===========================
* --backend-devices-virtual=1
* --workload-profile=4
-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------
hiprtcCompileProgram(): HIPRTC_ERROR_COMPILATION
lld: error: undefined hidden symbol: __ockl_get_group_id
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times
lld: error: undefined hidden symbol: __ockl_get_local_size
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times
lld: error: undefined hidden symbol: __ockl_get_local_id
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times
* Device #1: Kernel /root/hashcat/OpenCL/shared.cl build failed.
* Device #1: Kernel /root/hashcat/OpenCL/shared.cl build failed.
Started: Tue Apr 2 16:25:25 2024
Stopped: Tue Apr 2 16:25:27 2024
If I install without HIP and use only OpenCL, this is the performance I'm seeing:
# amdgpu-install --usecase=opencl
[root@gpu002 1]# /scratch/hashtopolis/crackers/1/hashcat.bin -b -w4 -m1000
hashcat (v6.2.6) starting in benchmark mode
/sys/bus/pci/devices/0000:31:00.0/hwmon/hwmon3/pwm1: No such file or directory
/sys/bus/pci/devices/0000:34:00.0/hwmon/hwmon4/pwm1: No such file or directory
/sys/bus/pci/devices/0000:11:00.0/hwmon/hwmon5/pwm1: No such file or directory
/sys/bus/pci/devices/0000:14:00.0/hwmon/hwmon6/pwm1: No such file or directory
/sys/bus/pci/devices/0000:ae:00.0/hwmon/hwmon7/pwm1: No such file or directory
/sys/bus/pci/devices/0000:b3:00.0/hwmon/hwmon8/pwm1: No such file or directory
/sys/bus/pci/devices/0000:8e:00.0/hwmon/hwmon9/pwm1: No such file or directory
/sys/bus/pci/devices/0000:93:00.0/hwmon/hwmon10/pwm1: No such file or directory
OpenCL API (OpenCL 2.1 AMD-APP (3602.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #1: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #2: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #3: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #4: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #5: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #6: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #7: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #8: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
Benchmark relevant options:
===========================
* --workload-profile=4
-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------
Speed.#1.........: 9333.3 MH/s (93.23ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#2.........: 9431.5 MH/s (92.30ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#3.........: 9415.0 MH/s (92.50ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#4.........: 9549.7 MH/s (91.21ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#5.........: 9576.7 MH/s (90.92ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#6.........: 9520.6 MH/s (91.46ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#7.........: 9513.1 MH/s (91.52ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#8.........: 9377.8 MH/s (92.81ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#*.........: 75717.6 MH/s
Started: Tue Apr 2 09:28:28 2024
Stopped: Tue Apr 2 09:28:38 2024
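For anyone who wants to compare these numbers against another run, here is a rough Python sketch that pulls the per-device speeds out of the benchmark text above and checks them against the reported aggregate. The regular expression and the `parse_speeds` helper are my own illustration, not part of Hashcat, and they assume the `Speed.#N` line layout shown in this output.

```python
import re

# Speed lines copied verbatim from the OpenCL benchmark output above.
BENCHMARK_OUTPUT = """\
Speed.#1.........: 9333.3 MH/s (93.23ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#2.........: 9431.5 MH/s (92.30ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#3.........: 9415.0 MH/s (92.50ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#4.........: 9549.7 MH/s (91.21ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#5.........: 9576.7 MH/s (90.92ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#6.........: 9520.6 MH/s (91.46ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#7.........: 9513.1 MH/s (91.52ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#8.........: 9377.8 MH/s (92.81ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#*.........: 75717.6 MH/s
"""

# Assumed layout: "Speed.#<device or *>....: <number> MH/s".
SPEED_LINE = re.compile(r"^Speed\.#(\d+|\*)\.+:\s+([\d.]+)\s+MH/s", re.MULTILINE)


def parse_speeds(text):
    """Return ({device_number: MH/s}, reported_total_or_None)."""
    per_device = {}
    reported_total = None
    for device, speed in SPEED_LINE.findall(text):
        if device == "*":
            reported_total = float(speed)  # the aggregate "Speed.#*" line
        else:
            per_device[int(device)] = float(speed)
    return per_device, reported_total


per_device, reported_total = parse_speeds(BENCHMARK_OUTPUT)
summed = sum(per_device.values())
mean = summed / len(per_device)

print(f"devices parsed : {len(per_device)}")
print(f"sum of devices : {summed:.1f} MH/s")
print(f"reported total : {reported_total:.1f} MH/s")
print(f"mean per device: {mean:.1f} MH/s")
```

The per-device figures add up to the reported aggregate to within rounding, so the total is simply the eight devices added together, with each one landing between roughly 9.3 and 9.6 GH/s.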