R9 290X Overheating
#1
Question 
Hi,
I am running oclHashcat v1.30 with an MSI R9 290X Gaming (Twin Frozr) GPU on Windows 7 64-bit.

AMD Catalyst Driver:
Driver Packaging Version: 14.20.1004-140811a-174673E
Catalyst Version: 14.7

CPU Case: ThermalTake Overseer RX-I Full Tower Gaming Case
CPU Fans:
Front (intake): 200 x 200 x 30 mm fan (600 rpm, 13 dBA)
Rear (exhaust): 120 x 120 x 25 mm Turbo Fan (1000 rpm, 16 dBA)
Top (exhaust): 200 x 200 x 30 mm fan (600 rpm, 13 dBA)
200 x 200 x 30 mm fan

Before starting oclHashcat, I start the MSI Afterburner app and set the fan to 100%; I can see and hear that the fans are running at full speed.

As soon as I start oclHashcat, the GPU temperature rises from the low 50s to 90 in less than five minutes, and then the process aborts.

I have tried a few options:
1) --gpu-temp-retain 80, but the temperature still climbs past 80.
2) Manually setting the workload profile to values 1 through 3; it only makes a difference of a minute or so.
3) Lower values for -n and -u; again, it just takes a few more minutes to reach 90.

This is a brand new card and looks like it has sufficient cooling. I do not experience this issue while playing games or running other benchmark stress tests using Furmark.

I am using the following commands
Code:
oclHashcat64.exe -m 2500  -w 3 -r rules/best64.rule capture.hccap rockyou.txt

oclHashcat64.exe -m 2500 -a3 capture.hccap ?d?d?d?d?d?d?d?d

Note: I see the following warning:
"WARN: Failed to get ADL Target Tempature Data"

For WPA, I am getting about 195-200 kH/s on average, but since the temperature rises within a few minutes, I have to either pause the session or use the session restore option to save progress, and then continue once the temperature drops.

I read multiple threads, and the suggested solutions were either to adjust the -n and -u values or to use the --gpu-temp-retain option. I have already tried all of these.

Is anyone else running an R9 290X seeing similar overheating issues?
Any advice is appreciated.

Update: Attached hardware monitoring status file.


#2
(09-29-2014, 03:05 AM)ciphercodes Wrote: This is a brand new card and looks like it has sufficient cooling. I do not experience this issue while playing games or running other benchmark stress tests using Furmark.

We say over and over again not to buy OEM design GPUs, yet people still do it.

No, your card does not have sufficient cooling for compute. Stress testing with Furmark is nothing compared to cracking ALU-bound hash algorithms. The cooler on your card is designed for gaming workloads, and will have a hard time coping with compute workloads.

That said, you should not be overheating that quickly. Sounds like you have a defective GPU.
#3
Thanks for the prompt reply, epixoip.
Can you suggest any other tests before I RMA this card?

I have attached an extract from the hardware monitor logs to the original post (in case it helps), which shows how quickly the temperature rises.
#4
hm, looking at your log it doesn't seem that your fans are actually at 100% before hashcat starts. fan speed is at 15% until the temp hits 80C, then doesn't hit 100% until 86C. by that time, the cooling solution on your card has no chance.

manually set the fan speed to 100% again, and try running oclHashcat with just --powertune-disable, and without switches like --gpu-temp-retain
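
The fan lag described above can be checked against any temperature/fan log with a few lines of Python. This is only a minimal sketch: the sample values are hypothetical, made up to mirror the pattern described in this post (15% until 80C, 100% at 86C), and are not taken from the attached log. The function name `fan_ramp` and the tuple layout are likewise assumptions for illustration.

```python
# Hypothetical (temperature_C, fan_percent) samples in time order.
# Illustrative only: NOT extracted from the attached hardware_monitoring.txt.
samples = [(52, 15), (65, 15), (79, 15), (80, 40), (83, 70), (86, 100), (88, 100)]


def fan_ramp(samples, idle_fan=15, full_fan=100):
    """Return the temperature at which the fan first leaves its idle speed
    and the temperature at which it first reaches full speed."""
    ramp_start = next((t for t, f in samples if f > idle_fan), None)
    full_speed = next((t for t, f in samples if f >= full_fan), None)
    return ramp_start, full_speed


start, full = fan_ramp(samples)
print(f"fan leaves idle at {start}C, reaches 100% at {full}C")
```

If the fan only reaches full speed in the mid 80s, as in these made-up samples, the manual 100% setting was not actually in effect when the job started.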
#5
Thanks again!
I am currently running oclHashcat with just --powertune-disable and I see that this time the temperature is not going over 86. It's been running like this for over 20 minutes and temperature fluctuates between 85-86. I noticed that hash rate has gone down to 172-173 kH/s but I am not worried about it at all.

Code:
Session.Name...: oclHashcat
Status.........: Running
Rules.Type.....: File (rules/best64.rule)
Input.Mode.....: File (rockyou.txt)
Hash.Target....: XXXXX (xx:xx:xx:xx:xx:xx <-> xx:xx:xx:xx:xx:xx)
Hash.Type......: WPA/WPA2
Time.Started...: Sun Sep 28 22:15:32 2014 (20 mins, 22 secs)
Time.Estimated.: Sun Sep 28 23:48:29 2014 (1 hour, 12 mins)
Speed.GPU.#1...:   172.2 kH/s
Recovered......: 0/1 (0.00%) Digests, 0/1 (0.00%) Salts
Progress.......: 369766107/1118777088 (33.05%)
Skipped........: 0/369766107 (0.00%)
Rejected.......: 158566107/369766107 (42.88%)
HWMon.GPU.#1...:  0% Util, 86c Temp, 100% Fan

I will continue to monitor and see if it continues to retain these temperature levels.

So it looks like the GPU is not defective; it's just not recommended for oclHashcat? Please correct me if I am wrong.
#6
The hash rate has gone down because the clock is throttling due to the heat.

It seems that the problem is that the 290X is supposed to be a Powertune 2.0 GPU, but a lot of OEMs are not fully implementing Powertune 2.0 on their cards. This is why you see the warning that ADL can't get the target temperature, which is a Powertune 2.0 feature, and why other Overdrive & Powertune features do not behave correctly.

So yeah, it's just a bad GPU to use with oclHashcat.
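
For a rough sense of how much throttling is involved, the hash rates quoted in this thread can be compared directly. This is a back-of-the-envelope sketch: it takes the midpoint of the 195-200 kH/s range reported in the first post as the unthrottled baseline, which is an assumption, and it only measures the drop in hash rate, not the actual core clock.

```python
# Figures quoted in the thread.
baseline_khs = (195 + 200) / 2   # midpoint of the reported 195-200 kH/s (assumed baseline)
throttled_khs = 172.2            # Speed.GPU.#1 from the --powertune-disable run

# Relative loss in hash rate while the card sits at 85-86C.
drop = (baseline_khs - throttled_khs) / baseline_khs
print(f"hash rate drop: {drop:.1%}")
```

That works out to a loss of roughly one eighth of the original speed.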
#7
Thank you for all your help, epixoip.

The session just finished with status Exhausted after running for about 1 hour and 5 minutes. It could have kept running if there had been more candidates to try.

Code:
Session.Name...: oclHashcat
Status.........: Exhausted
Rules.Type.....: File (rules/best64.rule)
Input.Mode.....: File (rockyou.txt)
Hash.Target....: XXXXX (xx:xx:xx:xx:xx:xx <-> xx:xx:xx:xx:xx:xx)
Hash.Type......: WPA/WPA2
Time.Started...: Sun Sep 28 22:15:32 2014 (1 hour, 5 mins)
Time.Estimated.: 0 secs
Speed.GPU.#1...:   170.7 kH/s
Recovered......: 0/1 (0.00%) Digests, 0/1 (0.00%) Salts
Progress.......: 1118777088/1118777088 (100.00%)
Skipped........: 0/1118777088 (0.00%)
Rejected.......: 442208416/1118777088 (39.53%)
HWMon.GPU.#1...: 40% Util, 85c Temp, 100% Fan

Started: Sun Sep 28 22:15:32 2014
Stopped: Sun Sep 28 23:20:58 2014
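
As a sanity check, the numbers in the final status output above are consistent with each other. The sketch below recomputes the average rate from the start/stop times and the progress counters. It assumes that the reported speed counts only candidates that were not rejected; that is an interpretation of the output, not something stated in the thread.

```python
from datetime import datetime

# Values copied from the final status screen.
started = datetime(2014, 9, 28, 22, 15, 32)
stopped = datetime(2014, 9, 28, 23, 20, 58)
progress = 1118777088   # total candidates processed
rejected = 442208416    # candidates rejected before hashing

elapsed = (stopped - started).total_seconds()
tested = progress - rejected                 # assumed: only these are hashed
rate_khs = tested / elapsed / 1000
rejected_pct = rejected / progress * 100

print(f"elapsed: {elapsed:.0f} s")
print(f"average rate: {rate_khs:.1f} kH/s")
print(f"rejected: {rejected_pct:.2f}%")
```

The average comes out at about 172 kH/s, in line with the 170.7-172.2 kH/s shown in the status screens, so the card held a steady (throttled) speed for the whole run.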
#8
Sorry to revive this old thread, but I am having the same overheating issue with oclHashcat 2.01.
With oclHashcat 1.30, the issue was resolved by disabling powertune. I know that powertune is disabled by default in the most recent versions, but oclHashcat 2.01 runs for less than two minutes and then stops because of overheating.
I tried different workload profiles, but the temperature still reaches 90 within a couple of minutes and oclHashcat stops.
oclHashcat 1.30 does not work anymore. Please advise.