AMD MI250 Instinct
Hi all, has anyone successfully run Hashcat using the HIP API on AMD MI250 GPUs? I can get Hashcat to run over OpenCL, but performance seems a bit slow. With HIP, I get compile errors as soon as I try to run. I'm on RHEL 8.9 with the latest version of amdgpu-install, and I get the same error messages with both the Hashcat 6.2.6 release and a build from source. Any thoughts on the OpenCL performance I'm getting? Could I expect better performance if I got HIP working? Thanks in advance!

This is the command I'm using to install the AMD ROCm driver:
# amdgpu-install --usecase=rocm,hiplibsdk,opencl

Here is the full output of benchmark:

# ./hashcat -b -w4 -m1000
hashcat (v6.2.6-850-gfafb277e0) starting in benchmark mode

The device #9 specifically listed was skipped because it is an alias of device #1
The device #10 specifically listed was skipped because it is an alias of device #2
The device #11 specifically listed was skipped because it is an alias of device #3
The device #12 specifically listed was skipped because it is an alias of device #4
The device #13 specifically listed was skipped because it is an alias of device #5
The device #14 specifically listed was skipped because it is an alias of device #6
The device #15 specifically listed was skipped because it is an alias of device #7
The device #16 specifically listed was skipped because it is an alias of device #8

/sys/bus/pci/devices/0000:31:00.0/hwmon/hwmon3/pwm1: No such file or directory
/sys/bus/pci/devices/0000:34:00.0/hwmon/hwmon4/pwm1: No such file or directory
/sys/bus/pci/devices/0000:11:00.0/hwmon/hwmon5/pwm1: No such file or directory
/sys/bus/pci/devices/0000:14:00.0/hwmon/hwmon6/pwm1: No such file or directory
/sys/bus/pci/devices/0000:ae:00.0/hwmon/hwmon7/pwm1: No such file or directory
/sys/bus/pci/devices/0000:b3:00.0/hwmon/hwmon8/pwm1: No such file or directory
/sys/bus/pci/devices/0000:8e:00.0/hwmon/hwmon9/pwm1: No such file or directory
/sys/bus/pci/devices/0000:93:00.0/hwmon/hwmon10/pwm1: No such file or directory

HIP API (HIP 6.0.32831)
=======================
* Device #1: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #2: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #3: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #4: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #5: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #6: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #7: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU
* Device #8: AMD Instinct MI250X/MI250, 65446/65520 MB, 104MCU

OpenCL API (OpenCL 2.1 AMD-APP (3602.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #9: AMD Instinct MI250X/MI250, skipped
* Device #10: AMD Instinct MI250X/MI250, skipped
* Device #11: AMD Instinct MI250X/MI250, skipped
* Device #12: AMD Instinct MI250X/MI250, skipped
* Device #13: AMD Instinct MI250X/MI250, skipped
* Device #14: AMD Instinct MI250X/MI250, skipped
* Device #15: AMD Instinct MI250X/MI250, skipped
* Device #16: AMD Instinct MI250X/MI250, skipped

Benchmark relevant options:
===========================
* --backend-devices-virtual=1
* --workload-profile=4

-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------

hiprtcCompileProgram(): HIPRTC_ERROR_COMPILATION

lld: error: undefined hidden symbol: __ockl_get_group_id
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

lld: error: undefined hidden symbol: __ockl_get_local_size
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

lld: error: undefined hidden symbol: __ockl_get_local_id
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_decompress)
>>> referenced by /root/hashcat/comgr-2e6d13/input/LLVMBitcode.bc.o:(gpu_memset)
>>> referenced 7 more times

* Device #1: Kernel /root/hashcat/OpenCL/shared.cl build failed.

* Device #1: Kernel /root/hashcat/OpenCL/shared.cl build failed.

Started: Tue Apr  2 16:25:25 2024
Stopped: Tue Apr  2 16:25:27 2024

If I install without HIP and use only OpenCL, this is the performance I'm seeing:
# amdgpu-install --usecase=opencl

[root@gpu002 1]# /scratch/hashtopolis/crackers/1/hashcat.bin -b -w4 -m1000
hashcat (v6.2.6) starting in benchmark mode

/sys/bus/pci/devices/0000:31:00.0/hwmon/hwmon3/pwm1: No such file or directory
/sys/bus/pci/devices/0000:34:00.0/hwmon/hwmon4/pwm1: No such file or directory
/sys/bus/pci/devices/0000:11:00.0/hwmon/hwmon5/pwm1: No such file or directory
/sys/bus/pci/devices/0000:14:00.0/hwmon/hwmon6/pwm1: No such file or directory
/sys/bus/pci/devices/0000:ae:00.0/hwmon/hwmon7/pwm1: No such file or directory
/sys/bus/pci/devices/0000:b3:00.0/hwmon/hwmon8/pwm1: No such file or directory
/sys/bus/pci/devices/0000:8e:00.0/hwmon/hwmon9/pwm1: No such file or directory
/sys/bus/pci/devices/0000:93:00.0/hwmon/hwmon10/pwm1: No such file or directory

OpenCL API (OpenCL 2.1 AMD-APP (3602.0)) - Platform #1 [Advanced Micro Devices, Inc.]
=====================================================================================
* Device #1: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #2: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #3: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #4: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #5: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #6: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #7: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU
* Device #8: AMD Instinct MI250X/MI250, 65408/65520 MB (55692 MB allocatable), 104MCU

Benchmark relevant options:
===========================
* --workload-profile=4

-----------------------
* Hash-Mode 1000 (NTLM)
-----------------------

Speed.#1.........:  9333.3 MH/s (93.23ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#2.........:  9431.5 MH/s (92.30ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#3.........:  9415.0 MH/s (92.50ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#4.........:  9549.7 MH/s (91.21ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#5.........:  9576.7 MH/s (90.92ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#6.........:  9520.6 MH/s (91.46ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#7.........:  9513.1 MH/s (91.52ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#8.........:  9377.8 MH/s (92.81ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#*.........: 75717.6 MH/s

Started: Tue Apr  2 09:28:28 2024
Stopped: Tue Apr  2 09:28:38 2024
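
To make it easier to compare this run against a future HIP run, the per-device figures can be tallied with a short script. This is only a sketch. The regular expression and the helper name `parse_speeds` are my own assumptions based on the `Speed.#N` lines shown above, not part of Hashcat, and the pattern may need adjusting for other hash modes or units.

```python
import re

# Speed lines copied verbatim from the OpenCL benchmark output above.
BENCHMARK_OUTPUT = """\
Speed.#1.........:  9333.3 MH/s (93.23ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#2.........:  9431.5 MH/s (92.30ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#3.........:  9415.0 MH/s (92.50ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#4.........:  9549.7 MH/s (91.21ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#5.........:  9576.7 MH/s (90.92ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#6.........:  9520.6 MH/s (91.46ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#7.........:  9513.1 MH/s (91.52ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#8.........:  9377.8 MH/s (92.81ms) @ Accel:128 Loops:1024 Thr:64 Vec:1
Speed.#*.........: 75717.6 MH/s
"""

# Assumed line shape: "Speed.#<device or *>....: <number> MH/s".
SPEED_LINE = re.compile(r"^Speed\.#(\d+|\*)\.*:\s+([\d.]+) MH/s", re.MULTILINE)


def parse_speeds(text):
    """Return ({device_number: MH/s}, reported_total_or_None)."""
    per_device = {}
    reported_total = None
    for device, speed in SPEED_LINE.findall(text):
        if device == "*":
            # The "#*" line is Hashcat's own aggregate across devices.
            reported_total = float(speed)
        else:
            per_device[int(device)] = float(speed)
    return per_device, reported_total


per_device, reported_total = parse_speeds(BENCHMARK_OUTPUT)
computed_total = sum(per_device.values())
mean_speed = computed_total / len(per_device)

print(f"devices:        {len(per_device)}")
print(f"computed total: {computed_total:.1f} MH/s")
print(f"reported total: {reported_total:.1f} MH/s")
print(f"mean/device:    {mean_speed:.1f} MH/s")
print(f"min/max:        {min(per_device.values()):.1f} / {max(per_device.values()):.1f} MH/s")
```

The computed sum should agree with the reported `Speed.#*` line to within rounding, and the spread between the slowest and fastest device is small, so none of the eight devices looks like an outlier in this run.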