Temperature control for multiple sessions
I have run into some scenarios with hashcat that led to the creation of this script.  It is a bash script for use with an Nvidia GPU on Linux.  I hope you guys get some use out of it.

*Edit*

I am troubleshooting an issue that might be exclusive to hashcat 3.30.  I will test with hashcat 3.2 to see if that version has the same issue.  I am running a single GTX 1080 FE with two different hashcat sessions.  The expected behavior is that the remaining session continues to run when the first one stops.  Instead, when one session stops on a checkpoint, I get the following error in the other session:

X Error of failed request:  BadValue (integer parameter out of range for operation)
 Major opcode of failed request:  157 (NV-CONTROL)
 Minor opcode of failed request:  3 ()
 Value in failed request:  0x17
 Serial number of failed request:  32
 Current serial number in output stream:  33

Then it crashes.

Perhaps some of you can test a similar scenario and see if you get a similar error.

So no joy on this script just yet.

*Another edit*

I am using hashcat 3.30 with the nvidia-370 driver, and the above problem has gone away.  It seems to be intermittent.  Querying GPUCoreTemp every 10 seconds may have been too frequent, so the interval has been increased to 15 seconds.  Testing is ongoing.  I will continue testing on or after January 23rd, 2017.
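
For anyone adapting the temperature query, here is a minimal sketch of a parser that is a little more defensive than a chain of awk calls.  The helper names parse_gpu_temp and is_valid_temp are hypothetical (they are not part of the script below), and the sketch assumes the verbose nvidia-settings output contains a line shaped like "Attribute 'GPUCoreTemp' (host:0.0): 58."  If your driver prints something different, the pattern will need adjusting.

```shell
#!/bin/bash
# Hypothetical helper: read verbose nvidia-settings query output on stdin
# and print only the integer temperature.
# Assumed line shape: "  Attribute 'GPUCoreTemp' (host:0.0): 58."
parse_gpu_temp() {
    # Keep the digits that follow "): " on the Attribute line; the dots
    # around GPUCoreTemp stand in for the single quotes in the output.
    sed -n 's/^ *Attribute .GPUCoreTemp. .*): *\([0-9][0-9]*\)\.\{0,1\} *$/\1/p' | head -n 1
}

# Hypothetical helper: succeed only if the argument is a plain integer,
# so a failed query never reaches an arithmetic comparison.
is_valid_temp() {
    [[ $1 =~ ^[0-9]+$ ]]
}

# Possible usage (needs a running X server and the Nvidia driver):
#   gpu0temp=$(nvidia-settings -q '[gpu:0]/GPUCoreTemp' --ctrl-display=:0 | parse_gpu_temp)
#   if is_valid_temp "$gpu0temp" && (( gpu0temp > 60 )); then
#           echo "Temperature has risen above 60 C"
#   fi
```

Checking the reading before comparing it means a failed query is skipped rather than silently treated as 0.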

Please report any issues with this script to this thread and I will investigate them.

Until next time.

Code:
#!/bin/bash

#This is for use with an Nvidia GPU in Linux.
#This script will keep the fan running on a single GPU in the event that you have more than one hashcat process running and one hashcat process is terminated.
#This script assumes that you already have the fan speed running to your liking during an existing hashcat session.
#KEEP THIS IN MIND BEFORE YOU RUN IT.  It will not set your fan speeds prior to you running hashcat for the first time.
#This script assumes that GPUFanControlState=1 and GPUPowerMizerMode=1 when executed. Fan control must already be in a manually controllable state.

#In cases where the fan speed has been set manually and --gpu-temp-retain has not been used to maintain the fan speeds, this has utility.

#Some scenarios where this might be useful are:
#1. --gpu-temp-retain has not been used.
#2. There was more than one hashcat session running, but one has terminated for some reason and has taken the fan speed down with it.
#3. You have old sessions (.restore files) that were not using --gpu-temp-retain and want to run more than one of these sessions at a time.  This could be from converting from water cooling to air cooling.  Manual fan control would be necessary as a result.

#This script will keep the GPU below 61 C by resetting the fan speed to 80 percent.
#This script was designed for use with fast hashes using the GTX 1080 FE.  80 percent fan is usually enough to keep the temperature below 61 C.
#This script uses nvidia-settings to get the temperature reading on the GPU (gpu:0).
#There is a 200 MHz overclock in the keep_fan_on function.  This was put in for the GTX 1080 FE, but can be adjusted or removed as you see fit.
#You can adjust the other values as you see fit.

#An alternate way to get the value for gpu0temp is:
#gpu0temp=$(nvidia-settings -q GPUCoreTemp --ctrl-display=:0 | grep 'Attribute' | grep 'gpu' | awk -F':' '{print $4}' | awk -F'.' '{print $1}' | awk -F ' ' '{print $1}')

#debug on
set -x

#Declare GPU temp integer
declare -i gpu0temp

#Declare functions for fan control
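function keep_fan_on {
        #Take manual control of the fan, set it to 80 percent, and apply the overclock
        nvidia-settings -a GPUFanControlState=1 --ctrl-display=:0
        nvidia-settings -a GPUTargetFanSpeed=80 --ctrl-display=:0
        nvidia-settings -a GPUPowerMizerMode=1 --ctrl-display=:0
        #Quoted so that the shell does not treat [3] as a glob pattern
        nvidia-settings -a 'GPUGraphicsClockOffset[3]=200' --ctrl-display=:0
}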

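function turn_fan_off {
        #Reset the overclock and hand fan control back to the driver when the process ends
        #Quoted so that the shell does not treat [3] as a glob pattern
        nvidia-settings -a 'GPUGraphicsClockOffset[3]=0' --ctrl-display=:0
        nvidia-settings -a GPUPowerMizerMode=0 --ctrl-display=:0
        nvidia-settings -a GPUFanControlState=0 --ctrl-display=:0
}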

while :
do
        #Watch for running hashcat process
        pidcheck=$(ps -e | grep 'hashcat')
        #Get temp from single gpu
        #The attribute is quoted so that the shell does not treat [screen:0] as a glob pattern
        gpu0temp=$(nvidia-settings --query '[screen:0]/GPUCoreTemp' --ctrl-display=:0 | grep 'Attribute' | awk -F':' '{print $3}' | awk -F'.' '{print $1}' | awk -F ' ' '{print $1}')
        date
        echo "Watching hashcat process(es)..."
        echo "Press Ctrl-C to stop this monitoring script."

        #As long as one hashcat process is detected, then keep the gpu fan running to keep the temperature down
        if [[ $pidcheck != "" ]]
        then
                if (( $gpu0temp > 60 ))
                then
                        #Keep fan at 80 percent if the conditions are met
                        #This may execute more than once at an interval of 15 seconds until the temperature is below 61 C.
                        echo "Temperature has risen above 60 C"
                        echo "Restoring fan speed to 80 percent."
                        #Set GraphicsClockOffset to 0 to avoid excessive overclocking if this executes more than once
                        nvidia-settings -a 'GPUGraphicsClockOffset[3]=0' --ctrl-display=:0
                        sleep 1
                        keep_fan_on
                fi
        else
                echo "No hashcat process(es) are detected."
                echo "Resetting fans to normal speed."
                turn_fan_off
                #All done, breaking out of while loop
                break
        fi
        sleep 15
        clear
done
echo "This monitoring script has terminated because hashcat no longer has any processes/sessions running."
echo "Fan speeds are back to normal."
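
One thing the script above does not do is undo the overclock or the manual fan state when you stop it with Ctrl-C, because there is no signal handler.  If you would rather hand control back to the driver on exit, a trap is one way to do that.  This is only a sketch under that assumption: restore_defaults and the NVSET variable are hypothetical names added for illustration, and NVSET exists only so the commands can be dry-run with echo.

```shell
#!/bin/bash
# Hypothetical: command used to apply settings. Override for a dry run,
# e.g. NVSET=echo, to print the arguments instead of applying them.
NVSET=${NVSET:-nvidia-settings}

# Hypothetical helper that mirrors turn_fan_off: clear the clock offset
# and return PowerMizer and fan control to the driver.
restore_defaults() {
        "$NVSET" -a 'GPUGraphicsClockOffset[3]=0' --ctrl-display=:0
        "$NVSET" -a GPUPowerMizerMode=0 --ctrl-display=:0
        "$NVSET" -a GPUFanControlState=0 --ctrl-display=:0
}

# On Ctrl-C (INT) or kill (TERM), restore the defaults and then exit.
trap 'restore_defaults; exit 130' INT TERM
```

Keep in mind that if hashcat is still running when you interrupt the monitor, this also removes the overclock and returns the fan to automatic control, so leave the trap out if you would rather keep the manual settings in place.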


Messages In This Thread
Temperature control for multiple sessions - by devilsadvocate - 01-18-2017, 06:54 AM