cudaHashcat-plus 0.09 ERROR: cuStreamSynchronize() 700 / 999
#1
Hi guys,

First of all, thanks atom for the wonderful software, really appreciate your work on hashcat!

I have a GTX690 that I've been trying to use with hashcat for auditing "-m 1800" (sha512crypt) hashes. While the card peaks at about 21,000 c/s, it constantly crashes with cuStreamSynchronize() 700 / 999 error messages after 5-50 minutes of running.

I've tried several settings down to "-n 1 --gpu-loops 1", but the crash still happens.

Interestingly, the crash does NOT take place with "-m 500" (md5crypt) hashes, only with "-m 1800". With "-m 500" I can do "-n 256 --gpu-loops 1000" and it works for days without problems at 2100K c/s.


I wonder, is this because my card is buggy, or is it something else?

I tried it under Linux (NVIDIA 304.x drivers) and Windows 7 (306.x drivers), exactly same behavior. In Linux it crashes with cuStreamSynchronize() 700, in Windows it alternates between 700 and 999.

Thanks
P
#2
This cuStreamSynchronize() 700 / 999 error is a nightmare. I got several reports for this, all on 6xx cards. Maybe there is a problem with the drivers? I can't reproduce this on my 560Ti, so I can't help on this.
#3
Hi atom!

Thanks for the kind reply.

Interestingly, John the Ripper (compiled for CUDA-X86-64) works fine, but at a slow 4900 c/s. It does not crash, but the speed is bad.

Not sure if this helps, but if you want, I can give you a remote shell on one of my computers with a 690, running Linux.

Let me know.

Thanks,
P

(09-22-2012, 11:28 AM)atom Wrote: This cuStreamSynchronize() 700 / 999 error is a nightmare. I got several reports for this, all on 6xx cards. Maybe there is a problem with the drivers? I can't reproduce this on my 560Ti, so I can't help on this.
#4
Hi!

Just in case, I tried today with Ubuntu 10.04 x64 (which other people reported as a working solution for cuStreamSynchronize() 700 errors), but it still happens.

I include below a cuda-memcheck error log for cuStreamSynchronize() 700, maybe it helps.

Quote:/usr/local/cuda-5.0/bin/cuda-memcheck ./cudaHashcat-plus64.bin -a 6 -m 1800 -n 32 --gpu-loops 5000 shadow example.dict ?d?d?d
========= CUDA-MEMCHECK
cudaHashcat-plus v0.09 by atom starting...

Hashes: 10 total, 9 unique salts, 9 unique digests
Bitmaps: 8 bits, 256 entries, 0x000000ff mask, 1024 bytes
Workload: 5000 loops, 32 accel
Watchdog: Temperature abort trigger set to 90c
Watchdog: Temperature retain trigger set to 80c
Device #1: GeForce GTX 690, 2047MB, 1019Mhz, 8MCU
Device #2: GeForce GTX 690, 2047MB, 1019Mhz, 8MCU
Device #1: Kernel ./kernels/4318/m1800.sm_30.ptx
Device #2: Kernel ./kernels/4318/m1800.sm_30.ptx

Scanned dictionary example.dict: 1210228 bytes, 129988 words, 129988000 keyspace, starting attack...

[s]tatus [p]ause [r]esume [b]ypass [q]uit =>

ERROR: cuStreamSynchronize() 700

========= Illegal Instruction
========= at 0x00018f88 in m1800_loop
========= by thread (224,0,0) in block (222,0,0)
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:/usr/lib/libcuda.so.1 (cuLaunchKernel + 0x3ae) [0xc64ae]
========= Host Frame:./cudaHashcat-plus64.bin [0x12776]
========= Host Frame:./cudaHashcat-plus64.bin [0x43d8]
========= Host Frame:./cudaHashcat-plus64.bin [0x6d5f]
========= Host Frame:./cudaHashcat-plus64.bin [0x7f66]
========= Host Frame:./cudaHashcat-plus64.bin [0xd54f]
========= Host Frame:/lib/libc.so.6 (__libc_start_main + 0xfd) [0x1ec4d]
========= Host Frame:./cudaHashcat-plus64.bin [0x2c49]
=========
========= Program hit error 700 on CUDA API call to cuStreamSynchronize
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib/libcuda.so.1 (cuStreamSynchronize + 0x1f6) [0xc7cb6]
========= Host Frame:./cudaHashcat-plus64.bin [0x12689]
========= Host Frame:./cudaHashcat-plus64.bin [0x73ef]
========= Host Frame:./cudaHashcat-plus64.bin [0x7f66]
========= Host Frame:./cudaHashcat-plus64.bin [0xd54f]
========= Host Frame:/lib/libc.so.6 (__libc_start_main + 0xfd) [0x1ec4d]
========= Host Frame:./cudaHashcat-plus64.bin [0x2c49]
=========
========= ERROR SUMMARY: 2 errors
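
For anyone collecting several of these logs to compare, here is a minimal Python sketch that pulls the fault kind, address, kernel name and thread/block coordinates out of cuda-memcheck output. It assumes the log looks like the one above (a fault line, then an "at ... in ..." line, then a "by thread ... in block ..." line); the function name parse_kernel_faults is made up for illustration and is not part of any NVIDIA or hashcat tool.

```python
import re

# Patterns modeled on the cuda-memcheck log shown above (an assumption:
# other cuda-memcheck versions may format these lines differently).
AT_RE = re.compile(r"^=+\s+at (0x[0-9a-fA-F]+) in (\S+)")
BY_RE = re.compile(
    r"^=+\s+by thread \((\d+),(\d+),(\d+)\) in block \((\d+),(\d+),(\d+)\)"
)
PREFIX_RE = re.compile(r"^=+\s*")


def parse_kernel_faults(log_text):
    """Return one dict per kernel fault found in a cuda-memcheck log."""
    lines = log_text.splitlines()
    faults = []
    for i, line in enumerate(lines):
        at = AT_RE.match(line)
        if not at or i == 0:
            continue
        fault = {
            # The line just before "at ..." names the fault kind.
            "kind": PREFIX_RE.sub("", lines[i - 1], count=1).strip(),
            "address": at.group(1),
            "kernel": at.group(2),
            "thread": None,
            "block": None,
        }
        # The line just after "at ..." carries thread/block coordinates.
        if i + 1 < len(lines):
            by = BY_RE.match(lines[i + 1])
            if by:
                nums = tuple(int(n) for n in by.groups())
                fault["thread"] = nums[:3]
                fault["block"] = nums[3:]
        faults.append(fault)
    return faults


# Excerpt copied from the log above.
SAMPLE = """\
========= Illegal Instruction
========= at 0x00018f88 in m1800_loop
========= by thread (224,0,0) in block (222,0,0)
========= Saved host backtrace up to driver entry point at kernel launch time
"""

for fault in parse_kernel_faults(SAMPLE):
    print(fault)
```

If the same kernel and address show up across crashes on different cards, that would point at the kernel code; if the address moves around, unstable hardware or clocks look more likely. That reading is only a rule of thumb, not something the log itself proves.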
#5
Same error on Windows:

"cudaHashcat-plus v0.09 by atom starting...

Hashes: 1 total, 1 unique salts, 1 unique digests
Bitmaps: 8 bits, 256 entries, 0x000000ff mask, 1024 bytes
Rules: 1
Workload: 16 loops, 8 accel
Watchdog: Temperature abort trigger set to 90c
Watchdog: Temperature retain trigger set to 80c
Device #1: GeForce GTX 680, 2048MB, 1124Mhz, 8MCU
Device #1: Kernel ./kernels/4318/m2500.sm_30.ptx

Scanned dictionary dico.txt: 107 bytes, 9 words, 9 keyspace, starting attack...

[s]tatus [p]ause [r]esume [b]ypass [q]uit => ERROR: cuStreamSynchronize() 999"
#6
Hi!

Some news from my side.
I was able to trace these errors to the "GPU Boost" function of the NVIDIA 6xx series.

Basically, I noticed that right before the crash, there is a spike in the GPU frequency and voltage.

As you know, the 6XX series includes a feature called "GPU Boost". This dynamically changes (overclocks) the GPU clock depending on the workload. In my case, with the 690, it would go from 915MHz to 1019MHz. The jumps also depend on the GPU temperature and cooling, so it generally is a random-looking curve.

I tried all sorts of tricks from decreasing the frequency, voltage, RAM freq and so on, but in the end, turning off the GPU boost did the trick. Here's a guide on how to do it:

http://www.overclock.net/t/1267918/guide...dervolting

Since doing this, no more 999/700 errors for me. The card is cracking sha512 hashes at a bit over 21,000 c/s, at a temp of about 78-79 C per GPU.

I recommend you try the same, maybe it works for you as well.
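
If you want to check whether your own crashes line up with boost spikes, a rough Python sketch like the one below can flag the clock samples that sit above the base clock. The 915MHz base and the 1019MHz peak are the values from my 690; the "clock, temperature" CSV layout, the 5MHz tolerance and the function names are assumptions for illustration, and how you log the samples (any monitoring tool that can write one reading per line) is up to you.

```python
def parse_clock_csv(text):
    """Parse lines like '915, 70' (clock MHz, temperature C) into clock values.

    The two-column layout is an assumed logging format, not the output of
    any specific tool. Lines whose first field is not a number are skipped.
    """
    clocks = []
    for line in text.splitlines():
        first = line.split(",")[0].strip()
        if first.isdigit():
            clocks.append(int(first))
    return clocks


def find_boost_samples(clocks_mhz, base_mhz=915, tolerance_mhz=5):
    """Return (index, clock) pairs for samples above the base clock.

    base_mhz defaults to the GTX 690 base clock mentioned above;
    tolerance_mhz is an arbitrary margin to ignore small jitter.
    """
    limit = base_mhz + tolerance_mhz
    return [(i, mhz) for i, mhz in enumerate(clocks_mhz) if mhz > limit]


# Made-up sample log for illustration only, not measured data.
SAMPLE_LOG = """\
915, 70
915, 72
928, 75
1019, 79
N/A, N/A
915, 78
"""

clocks = parse_clock_csv(SAMPLE_LOG)
for index, mhz in find_boost_samples(clocks):
    print(f"sample {index}: {mhz} MHz is above base clock")
```

Comparing the timestamps of the flagged samples with the time of the 700/999 error should show whether the crash follows a boost spike on your card too.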
#7
Good Info, thanks!
#8
(10-04-2012, 08:44 AM)paul6990 Wrote: I tried all sorts of tricks from decreasing the frequency, voltage, RAM freq and so on, but in the end, turning off the GPU boost did the trick. Here's a guide on how to do it:

http://www.overclock.net/t/1267918/guide...dervolting

Since doing this, no more 999/700 errors for me. The card is cracking sha512 hashes at a bit over 21,000 c/s, at a temp of about 78-79 C per GPU.

I recommend you try the same, maybe it works for you as well.

I forced the P2 state at the default frequency just for testing, and this doesn't work:
nvidiainspector.exe -forcepstate:0,2

I compiled another open source CUDA MD5 cracker with the latest NVIDIA SDK and there is no problem, so I guess there is something wrong in the hashcat code.
#9
Another solution I found (which is maybe easier) is to run something like Asus GPU Tweak:

http://www.asus.com/Graphics_Cards/Features/GPU_Tweak

Reduce the clock speed by 10%; that should do the trick.
Also, make sure the fan speed is properly controlled and the GPU stays below 80 C.

Good luck!
#10
(10-04-2012, 05:06 PM)koulikov Wrote: I compiled another open source CUDA MD5 cracker with the latest NVIDIA SDK and there is no problem, so I guess there is something wrong in the hashcat code.

How can you compare? WPA uses HMAC-SHA1, not MD5.