hashcat Forum
cuInit(): no CUDA-capable device is detected on WSL2 The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali)


cuInit(): no CUDA-capable device is detected on WSL2 The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali) - mumphus - 09-24-2023

Running deviceQuery:
Code:
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce RTX 3080 Ti"
  CUDA Driver Version / Runtime Version          12.2 / 12.2
  CUDA Capability Major/Minor version number:    8.6
  Total amount of global memory:                12288 MBytes (12884377600 bytes)
  (80) Multiprocessors, (128) CUDA Cores/MP:    10240 CUDA Cores
  GPU Max Clock rate:                            1665 MHz (1.66 GHz)
  Memory Clock rate:                            9501 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                6291456 bytes
  Maximum Texture Dimension Size (x,y,z)        1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:              65536 bytes
  Total amount of shared memory per block:      49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                    32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:          1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                            512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                    Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:      Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:  0 / 1 / 0
  Compute Mode:
    < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.2, CUDA Runtime Version = 12.2, NumDevs = 1, Device0 = NVIDIA GeForce RTX 3080 Ti
Result = PASS


Running hashcat -I:
Code:
hashcat -I
hashcat (v6.2.6) starting in backend information mode

cuInit(): no CUDA-capable device is detected

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: The pocl project
  Name....: Portable Computing Language
  Version.: OpenCL 3.0 PoCL 4.0+debian  Linux, None+Asserts, RELOC, SPIR, LLVM 15.0.7, SLEEF, DISTRO, POCL_DEBUG

  Backend Device ID #1
    Type...........: CPU
    Vendor.ID......: 128
    Vendor.........: GenuineIntel
    Name...........: cpu-haswell-13th Gen Intel(R) Core(TM) i7-13700KF
    Version........: OpenCL 3.0 PoCL HSTR: cpu-x86_64-pc-linux-gnu-haswell
    Processor(s)...: 24
    Clock..........: 3417
    Memory.Total...: 13851 MB (limited to 2048 MB allocatable in one block)
    Memory.Free....: 6893 MB
    Local.Memory...: 2048 KB
    OpenCL.Version.: OpenCL C 1.2 PoCL
    Driver.Version.: 4.0+debian

Code:
nvidia-smi
Sun Sep 24 13:55:35 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.125.06   Driver Version: 537.42       CUDA Version: 12.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
|  0%   59C    P8    66W / 350W |   1011MiB / 12288MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       100      G   /Xwayland                       N/A      |
+-----------------------------------------------------------------------------+



RE: cuInit(): no CUDA-capable device is detected on WSL2 The-Distribution-Which-Does-N... - MrRaja - 09-25-2023

It seems you're encountering an issue with CUDA on WSL2 with The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali) Linux.
To resolve this problem, you can follow these steps:

1. Check Your WSL2 Configuration:
  - Ensure that you have WSL2 properly installed and configured on your Windows machine.
  - Make sure you have the necessary drivers for your GPU installed on your Windows host machine.
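
One way to confirm from inside WSL2 that the Windows driver is actually exposed to the distribution is to look for the paravirtualised GPU device and the CUDA library that WSL provides. The sketch below assumes the usual WSL2 locations (`/dev/dxg` and `/usr/lib/wsl/lib/libcuda.so.1`). The function name `check_wsl_gpu` and its optional root-prefix argument are made up for illustration.

```shell
# Hypothetical helper: report whether the files WSL2 normally provides for
# GPU compute are present. The optional first argument is a root prefix,
# which only exists so the function can be tried against a test directory.
check_wsl_gpu() {
  root="${1:-}"
  rc=0
  for f in /dev/dxg /usr/lib/wsl/lib/libcuda.so.1; do
    if [ -e "${root}${f}" ]; then
      echo "ok      ${f}"
    else
      echo "missing ${f}"
      rc=1
    fi
  done
  return "$rc"
}

# Check the running system; a non-zero status means something is missing.
check_wsl_gpu || echo "GPU passthrough looks incomplete"
```

If either path is reported missing, the problem is more likely on the Windows driver or WSL side than in hashcat itself.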

2. Update The-Distribution-Which-Does-Not-Handle-OpenCL-Well (Kali) Linux:
  - Open a terminal in the distribution and run the following commands to update its packages:
Code:
sudo apt update
sudo apt upgrade



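3. Do Not Install a Linux NVIDIA Driver Inside WSL:
  - On WSL2 the GPU is exposed by the NVIDIA driver installed on the Windows host. NVIDIA's WSL documentation says not to install a Linux display driver inside the distribution, because it can overwrite the CUDA library that WSL provides. Check whether a Linux driver package is already installed:
Code:
    dpkg -l | grep -i nvidia-driver


4. Restart WSL:
  - After changing drivers or packages, run `wsl --shutdown` from a Windows prompt (or reboot the computer) so the changes take effect.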

5. Verify CUDA Installation:
  - After restarting, open the terminal again and check that the CUDA toolkit is installed (`nvcc` reports the toolkit version, not the driver version):
Code:
    nvcc --version


6. Check CUDA Device:
  - To check if CUDA-capable devices are detected, you can run:
Code:
    nvidia-smi


  - If no devices are detected, it's possible that your GPU is not compatible with WSL2 or there may be a configuration issue on your Windows host.
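
In the first post, nvidia-smi and deviceQuery both see the GPU while hashcat does not, so it can help to check what hashcat itself reports. This is a minimal sketch, assuming the `hashcat -I` output format shown in that post; `summarise_backends` is a made-up helper name.

```shell
# Hypothetical helper: summarise `hashcat -I` output read from stdin.
# It counts the "Type...: GPU" and "Type...: CPU" device lines and notes
# whether the cuInit() error line appeared.
summarise_backends() {
  awk '
    /cuInit\(\)/      { cuda_failed = 1 }
    /Type\.+: GPU/    { gpu++ }
    /Type\.+: CPU/    { cpu++ }
    END {
      printf "cuda_init_failed=%d gpu_devices=%d cpu_devices=%d\n", cuda_failed, gpu, cpu
    }'
}

# Typical use (not run here, since it needs hashcat on the PATH):
#   hashcat -I 2>&1 | summarise_backends
```

For output like that in the first post, this prints `cuda_init_failed=1 gpu_devices=0 cpu_devices=1`: hashcat only sees the PoCL CPU device, even though the driver tools see the GPU.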

7. WSL2 Configuration File:
  - The `.wslconfig` file in your Windows user directory (`C:\Users\<your_username>\.wslconfig`) controls the resources given to the WSL2 virtual machine. You can create or modify it there. An example that sets memory and CPU allocation looks like this (these settings do not by themselves enable GPU access):
Code:
    [wsl2]
    memory=4GB  # Your preferred memory allocation
    processors=2 # Your preferred CPU allocation


8. Check WSL2 Version:
  - Make sure you are using WSL2. You can check the WSL version by running:
Code:
    wsl --list --verbose
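
`wsl --list --verbose` is run from the Windows side. From inside the distribution, the kernel release string gives a similar hint. This is a sketch, assuming the stock Microsoft kernels (WSL2 releases contain `microsoft-standard`, WSL1 releases end in `Microsoft`). A custom kernel may not match, and `wsl_flavor` is a made-up name.

```shell
# Hypothetical helper: classify a kernel release string as printed by `uname -r`.
wsl_flavor() {
  case "$1" in
    *microsoft-standard*) echo "wsl2" ;;     # e.g. 5.15.90.1-microsoft-standard-WSL2
    *Microsoft*)          echo "wsl1" ;;     # e.g. 4.4.0-19041-Microsoft
    *)                    echo "not-wsl" ;;  # anything else, including custom kernels
  esac
}

# Classify the kernel of the system this runs on.
wsl_flavor "$(uname -r)"
```

GPU compute needs WSL2, so a result of `wsl1` would explain a missing CUDA device on its own.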


9. WSL2 Kernel Update:
  - Ensure that you are using the latest WSL2 kernel. Running `wsl --update` from a Windows prompt updates it; the kernel sources are in the [Microsoft WSL2-Linux-Kernel repository](https://github.com/microsoft/WSL2-Linux-Kernel).

10. Come Back to the Forum:
  - If you've followed these steps and are still facing issues, I can't help any further. Maybe someone else can.


RE: cuInit(): no CUDA-capable device is detected on WSL2 The-Distribution-Which-Does-N... - Philomoe - 08-17-2024

I ran into the same problem on WSL2 Ubuntu 22.04.

Code:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVIDIA GeForce RTX 4060 Laptop GPU"
  CUDA Driver Version / Runtime Version          12.4 / 12.6
  CUDA Capability Major/Minor version number:    8.9
  Total amount of global memory:                8188 MBytes (8585216000 bytes)
  (024) Multiprocessors, (128) CUDA Cores/MP:    3072 CUDA Cores
  GPU Max Clock rate:                            1890 MHz (1.89 GHz)
  Memory Clock rate:                            8001 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                33554432 bytes
  Maximum Texture Dimension Size (x,y,z)        1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
  Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
  Total amount of constant memory:              65536 bytes
  Total amount of shared memory per block:      49152 bytes
  Total shared memory per multiprocessor:        102400 bytes
  Total number of registers available per block: 65536
  Warp size:                                    32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:          1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                            512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                    Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:      Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device supports Managed Memory:                Yes
  Device supports Compute Preemption:            Yes
  Supports Cooperative Kernel Launch:            Yes
  Supports MultiDevice Co-op Kernel Launch:      No
  Device PCI Domain ID / Bus ID / location ID:  0 / 1 / 0
  Compute Mode:
    < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 12.4, CUDA Runtime Version = 12.6, NumDevs = 1
Result = PASS

Code:
hashcat -I
hashcat (v6.2.5) starting in backend information mode

cuInit(): no CUDA-capable device is detected

OpenCL Info:
============

OpenCL Platform ID #1
  Vendor..: The pocl project
  Name....: Portable Computing Language
  Version.: OpenCL 2.0 pocl 1.8  Linux, None+Asserts, RELOC, LLVM 11.1.0, SLEEF, DISTRO, POCL_DEBUG

  Backend Device ID #1
    Type...........: CPU
    Vendor.ID......: 128
    Vendor.........: GenuineIntel
    Name...........: pthread-13th Gen Intel(R) Core(TM) i9-13900HX
    Version........: OpenCL 1.2 pocl HSTR: pthread-x86_64-pc-linux-gnu-goldmont
    Processor(s)...: 32
    Clock..........: 2419
    Memory.Total...: 13943 MB (limited to 2048 MB allocatable in one block)
    Memory.Free....: 6939 MB
    OpenCL.Version.: OpenCL C 1.2 pocl
    Driver.Version.: 1.8



RE: cuInit(): no CUDA-capable device is detected on WSL2 The-Distribution-Which-Does-N... - penguinkeeper - 08-18-2024

There's no reason to run hashcat in WSL. It performs just fine on Windows, and the extra virtualisation slows it down very significantly and, as you're experiencing, adds a lot of stability problems. Just download the binaries from the top of https://hashcat.net and run them directly in Windows.


RE: cuInit(): no CUDA-capable device is detected on WSL2 The-Distribution-Which-Does-N... - Snoopy - 08-20-2024

(08-18-2024, 07:21 PM)penguinkeeper Wrote: There's no reason to run hashcat in WSL. It performs just fine on Windows, and the extra virtualisation slows it down very significantly and, as you're experiencing, adds a lot of stability problems. Just download the binaries from the top of https://hashcat.net and run them directly in Windows.

You should use the latest beta build from the link below, since the latest official release is quite old:

https://hashcat.net/beta/