oclHashcat clustering with VirtualCL

NOTE: This technique is legacy and no longer supported

VirtualCL (VCL) is a cluster platform that allows OpenCL applications to transparently utilize many OpenCL devices in a cluster, as if all the devices were on the local computer. We can leverage this virtualization technology to create GPU clusters for use with Hashcat.

Those new to VCL may want to watch epixoip's Passwords^12 presentation on VCL clustering with Hashcat.

Disclaimer

This guide is intended to act as a primer for basic installation and use of VCL. It does not exhaustively cover all aspects of running a VCL cluster. Perhaps most importantly, it does not cover how to secure your VCL cluster.

Following this guide step-by-step has the very real potential of resulting in a highly insecure configuration. However, as cluster environments tend to be quite diverse, we cannot possibly cover all aspects of securing your cluster. Therefore, securing the cluster is outside the scope of this article and is left as an exercise for the reader. We trust (ha!) that you are experienced enough to perform your own risk assessment and apply the appropriate controls.

THIS GUIDE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE. WE WILL NOT ACCEPT ANY LIABILITY FOR ANY SECURITY RISKS YOU INTRODUCE TO YOUR ENVIRONMENT UPON FOLLOWING THIS GUIDE.

At a minimum, you should ensure that all commands are executed in a trusted – and preferably isolated – environment.

Software requirements

All nodes need a 64-bit Linux installation and VCL 1.22. Compute nodes additionally need X11 and the AMD Catalyst (fglrx) driver; the broker node needs oclHashcat-plus and/or oclHashcat-lite, and must not have the Catalyst driver installed.

Hardware requirements

You will need at least two machines: a broker node with at least 2GB of RAM per GPU in the cluster, and one or more compute nodes with between one and eight AMD GPUs each, all connected by a dedicated Gigabit Ethernet (or faster) network.

Plan your LAN!

At a minimum you will need two computers: a broker node, and a compute node.

The broker node is essentially the cluster master – it is the system that controls all the other nodes in the cluster. It does not need to have any GPU devices installed, and it should not have the AMD Catalyst driver installed. This node can be just about anything – even a virtual machine – as long as it has at least 2GB of RAM per GPU in your cluster. This is the only node that you will interact with directly.

Compute nodes are essentially the cluster slaves – they are the back-end systems that execute your OpenCL kernels. They need to have at least one GPU, and may have as many as 8 GPUs. They will need to have all the normal stuff: X11, the AMD Catalyst driver, etc. You will not interact with these nodes directly, as they will be controlled completely from the broker node.

All nodes will need to be on a private, dedicated, high-speed network. It is highly recommended that your VCL traffic be on its own physically isolated network. If this is not possible, you should at least ensure your VCL nodes are on their own subnet and their own VLAN. Ideally, there should be no routers involved in your cluster. While VCL traffic is routable, routing will have a severe impact on performance. Thus, nodes should either be directly connected, or networked via a local switch only.

Your network should use Gigabit Ethernet at a bare minimum. Your compute nodes will each use approximately 90 Mbit/s per GPU, and may occasionally spike to over 100 Mbit/s per GPU. However, as Ethernet latencies will negatively impact performance, it is highly recommended that you use a minimum of 4x SDR InfiniBand instead.
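
If you want to sanity-check a link before building out the cluster, a quick throughput and latency test between two nodes is enough. A minimal sketch, assuming iperf is installed on both machines and using the example 192.168.10.x addresses from later in this guide:

# on a compute node (e.g. 192.168.10.2), assuming iperf is installed
iperf -s

# on the broker node: measure throughput, then round-trip latency
iperf -c 192.168.10.2 -t 10
ping -c 10 192.168.10.2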

Install VCL v1.22

Each node in the cluster needs to have VCL installed. Therefore, the following instructions must be performed on every node.

wget http://www.mosix.org/vcl/VCL-1.22.tbz
tar xf VCL-1.22.tbz
cd vcl-1.22/
 
mkdir /usr/lib/vcl /etc/vcl
install -m 0755 -o root opencld /sbin/opencld
install -m 0755 -o root opencld_1.2 /sbin/opencld_1.2
install -m 0755 -o root broker /sbin/broker
install -m 0755 -o root libOpenCL.so /usr/lib/vcl/libOpenCL.so
install -m 0644 -o root man/man7/vcl.7 /usr/share/man/man7/vcl.7
 
ln -s libOpenCL.so /usr/lib/vcl/libOpenCL.so.1
ln -s /usr/lib/vcl/libOpenCL.so.1 /usr/lib/libOpenCL.so
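
As a quick sanity check (optional, and nothing VCL-specific), confirm that the binaries and library symlinks ended up where the rest of this guide expects them:

ls -l /sbin/opencld /sbin/opencld_1.2 /sbin/broker
ls -l /usr/lib/vcl/
# this should print /usr/lib/vcl/libOpenCL.so
readlink -f /usr/lib/libOpenCL.so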

The init script that ships with VCL was written explicitly for openSUSE/MOSIX and is incompatible with other distributions such as Ubuntu. On Debian-based systems, download and install the replacement init script instead:

wget https://web.archive.org/web/20140607162252/http://bindshell.nl/pub/vcl.init -O vcl.init
install -m 0755 -o root vcl.init /etc/init.d/vcl
update-rc.d vcl defaults
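
On Debian-based systems you can verify that update-rc.d registered the script for the default runlevels:

ls -l /etc/rc?.d/*vcl*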

Configure the Compute Nodes

Run the following commands on each compute node in the cluster. Do not execute any of these commands on the broker node unless you know what you are doing!

touch /etc/vcl/is_back_end
touch /etc/vcl/amd-1.2
xhost +

PROTIP: Put the xhost + command in your window manager's autostart file so that it runs automatically on boot.

/etc/init.d/vcl start
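
To confirm that the compute node is ready to accept work from the broker, check that the VCL back-end daemon started. Depending on your configuration it may run as opencld or opencld_1.2, so grep for the common prefix; if all is well it should also be listening for connections from the broker:

# both checks assume standard procps/net-tools utilities
ps aux | grep -v grep | grep opencld
netstat -lntp | grep opencld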

Configure the Broker Node

Run the following commands on the broker node only.

touch /etc/vcl/is_host
rm -f /etc/vcl/nodes
echo 192.168.10.2 >> /etc/vcl/nodes
echo 192.168.10.3 >> /etc/vcl/nodes

NOTE: Replace the 192.168.10.x nodes with the actual IP addresses of your compute nodes…

/etc/init.d/vcl start
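
Before installing Hashcat, it is worth confirming that the broker can actually see the GPUs in your compute nodes. Any OpenCL client pointed at VCL's libOpenCL will do; the example below assumes clinfo is installed on the broker:

# every GPU in the cluster should be listed
LD_LIBRARY_PATH=/usr/lib/vcl clinfo | grep "  Name:"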

Install Hashcat

Install oclHashcat-plus or oclHashcat-lite as normal on the broker node. Do not install Hashcat on the compute nodes.
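
For example, assuming you are using the oclHashcat-plus v0.15 release shown in the examples below, which ships as a 7-Zip archive:

# extract on the broker node only (archive name assumes v0.15)
7za x oclHashcat-plus-0.15.7z
cd oclHashcat-plus-0.15/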

Run Hashcat

Running Hashcat under VCL is largely the same as running it without VCL, with a few small differences.

export LD_LIBRARY_PATH=/usr/lib/vcl

PROTIP: Add this line to your .bashrc or update your ld.so.conf instead
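
Either of the following makes the setting persistent; the ld.so.conf approach is the same one used in the Troubleshooting section:

# per-user, via the shell
echo 'export LD_LIBRARY_PATH=/usr/lib/vcl' >> ~/.bashrc

# or system-wide, via the dynamic linker configuration
echo /usr/lib/vcl > /etc/ld.so.conf.d/vcl.conf
ldconfig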

Now just run Hashcat like normal!

Example:

root@token:~/oclHashcat-plus-0.15# export LD_LIBRARY_PATH=/usr/lib/vcl
 
root@token:~/oclHashcat-plus-0.15# ./oclHashcat-plus64.bin -t 32 -a 7 example0.hash ?a?a?a?a example.dict
 
oclHashcat-plus v0.15 by atom starting...
 
Hashes: 6494 total, 1 unique salts, 6494 unique digests
Bitmaps: 16 bits, 65536 entries, 0x0000ffff mask, 262144 bytes
Workload: 256 loops, 80 accel
Watchdog: Temperature abort trigger disabled
Watchdog: Temperature retain trigger disabled
Device #1: Cayman, 1024MB, 800Mhz, 24MCU
Device #2: Cypress, 512MB, 800Mhz, 20MCU
Device #3: Cayman, 1024MB, 800Mhz, 24MCU
Device #4: Cypress, 512MB, 800Mhz, 20MCU
Device #5: Cayman, 1024MB, 800Mhz, 24MCU
Device #6: Cypress, 512MB, 800Mhz, 20MCU
Device #7: Cayman, 1024MB, 800Mhz, 24MCU
Device #8: Cypress, 512MB, 800Mhz, 20MCU
Device #9: Cayman, 1024MB, 800Mhz, 24MCU
Device #10: Cypress, 512MB, 800Mhz, 20MCU
Device #11: Cayman, 1024MB, 800Mhz, 24MCU
Device #1: Kernel ./kernels/4098/m0000_a1.Cayman_938.2_CAL 1.4.1741.kernel (490316 bytes)
Device #2: Kernel ./kernels/4098/m0000_a1.Cypress_938.2_CAL 1.4.1741.kernel not found in cache! Building may take a while...
Device #2: Kernel ./kernels/4098/m0000_a1.Cypress_938.2_CAL 1.4.1741.kernel (456788 bytes)
Device #3: Kernel ./kernels/4098/m0000_a1.Cayman_938.2_CAL 1.4.1741.kernel (490316 bytes)
Device #4: Kernel ./kernels/4098/m0000_a1.Cypress_938.2_CAL 1.4.1741.kernel (456788 bytes)
Device #5: Kernel ./kernels/4098/m0000_a1.Cayman_938.2_CAL 1.4.1741.kernel (490316 bytes)
Device #6: Kernel ./kernels/4098/m0000_a1.Cypress_938.2_CAL 1.4.1741.kernel (456788 bytes)
Device #7: Kernel ./kernels/4098/m0000_a1.Cayman_938.2_CAL 1.4.1741.kernel (490316 bytes)
Device #8: Kernel ./kernels/4098/m0000_a1.Cypress_938.2_CAL 1.4.1741.kernel (456788 bytes)
Device #9: Kernel ./kernels/4098/m0000_a1.Cayman_938.2_CAL 1.4.1741.kernel (490316 bytes)
Device #10: Kernel ./kernels/4098/m0000_a1.Cypress_938.2_CAL 1.4.1741.kernel (456788 bytes)
Device #11: Kernel ./kernels/4098/m0000_a1.Cayman_938.2_CAL 1.4.1741.kernel (490316 bytes)
 
Cache-hit dictionary stats example.dict: 1210228 bytes, 129988 words, 129988 keyspace
 
e973fc5ba3f41964b28cbf1d3d3e7f5c:dyn10001000
7ac74b85753f5cee6c9446b103cc59e8:scas0000
aa1ce869d58da099b8c15859540ad220:linn01111986
0142b84c7d5ab92691cf9e21fbca9a08:f8140123
5be4a34c048a4d4af2462679b7886151:55bo02020202
d2a92d0b635bdc8fd3df226b7084dc8c:368901man

A second example, benchmarking NTLM with oclHashcat-lite:

root@token:~/oclHashcat-lite-0.15# export LD_LIBRARY_PATH=/usr/lib/vcl
 
root@token:~/oclHashcat-lite-0.15# ./oclHashcat-lite64.bin -b --benchmark-mode 1 -m 1000
 
oclHashcat-lite v0.15 by atom starting...
 
Password lengths: 1 - 54
Watchdog: Temperature abort trigger disabled
Watchdog: Temperature retain trigger disabled
Device #1: Cayman, 1024MB, 800Mhz, 24MCU
Device #2: Cypress, 512MB, 800Mhz, 20MCU
Device #3: Cayman, 1024MB, 800Mhz, 24MCU
Device #4: Cypress, 512MB, 800Mhz, 20MCU
Device #5: Cayman, 1024MB, 800Mhz, 24MCU
Device #6: Cypress, 512MB, 800Mhz, 20MCU
Device #7: Cayman, 1024MB, 800Mhz, 24MCU
Device #8: Cypress, 512MB, 800Mhz, 20MCU
Device #9: Cayman, 1024MB, 800Mhz, 24MCU
Device #10: Cypress, 512MB, 800Mhz, 20MCU
Device #11: Cayman, 1024MB, 800Mhz, 24MCU
 
[s]tatus [p]ause [r]esume [q]uit =>
NOTE: Runtime limit reached, aborting...
 
 
Hash.Type....: NTLM
Speed.GPU.#1.:  9804.3M/s
Speed.GPU.#2.:  9400.3M/s
Speed.GPU.#3.:  9790.4M/s
Speed.GPU.#4.:  9381.1M/s
Speed.GPU.#5.:  9793.9M/s
Speed.GPU.#6.:  9381.4M/s
Speed.GPU.#7.:  9785.4M/s
Speed.GPU.#8.:  9381.3M/s
Speed.GPU.#9.:  9783.6M/s
Speed.GPU.#10.:  9936.9M/s
Speed.GPU.#11.:  9784.6M/s
Speed.GPU.#*.:   106.2G/s
 
Started: Fri Jan  4 11:38:37 2013
Stopped: Fri Jan  4 11:38:57 2013

Have Fun!

Troubleshooting

The following troubleshooting step should be performed only on the broker node:

# echo /usr/lib/vcl > /etc/ld.so.conf.d/vcl.conf
# ldconfig
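
You can then verify that the dynamic linker resolves libOpenCL to the VCL copy:

# ldconfig -p | grep -i opencl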

The following troubleshooting steps should be performed only on the compute nodes:

# which clinfo
/usr/bin/clinfo
# dmesg | egrep 'fglrx.*module loaded'
[  369.158196] [fglrx] module loaded - fglrx 13.20.4 [Jul 26 2013] with 6 minors
# clinfo | grep "  Name:"
  Name:                                          Tahiti
  Name:                                          Tahiti
  Name:                                          Tahiti
  Name:                                          Tahiti
  Name:                                          Tahiti
  Name:                                          Tahiti
  Name:                                          Intel(R) Xeon(R) CPU E5645 @ 2.40GHz