Bitslice status and broken nvcc
cudaHashcat v1.38 starting in benchmark-mode...

Device #1: GeForce GTX TITAN X, 12288MB, 1076Mhz, 24MCU

Hashtype: LM
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 20086.4 MH/s

Started: Sat Oct 31 10:41:22 2015
Stopped: Sat Oct 31 10:41:40 2015

meh, okay
Awesome. So a full brute force of the uppercase ASCII keyspace is down to just over 6 minutes.
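
The "just over 6 minutes" figure can be sanity-checked with a rough sketch like the one below. It assumes a 69-character set (the 95 printable ASCII characters minus the 26 lowercase letters, since LM uppercases its input) and candidate lengths 1 to 7 (LM hashes each 7-character half on its own). Both assumptions and all names are illustrative, not taken from the benchmark itself.

```python
# Rough estimate of LM brute-force time at the benchmarked speed.
# Assumption: 69 characters = 95 printable ASCII minus 26 lowercase letters.
CHARSET_SIZE = 95 - 26
MAX_LEN = 7              # LM hashes each 7-character half independently
SPEED_HPS = 20086.4e6    # 20086.4 MH/s from the benchmark above


def keyspace(charset_size, max_len):
    """Number of candidates of length 1..max_len over the given charset."""
    return sum(charset_size ** n for n in range(1, max_len + 1))


def brute_force_minutes(charset_size, max_len, speed_hps):
    """Minutes needed to exhaust the keyspace at speed_hps hashes per second."""
    return keyspace(charset_size, max_len) / speed_hps / 60.0


minutes = brute_force_minutes(CHARSET_SIZE, MAX_LEN, SPEED_HPS)
print(f"{keyspace(CHARSET_SIZE, MAX_LEN):,} candidates, {minutes:.2f} minutes")
```

Under those assumptions the result lands between 6 and 7 minutes, which matches the estimate above.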
(10-30-2015, 05:42 PM) atom wrote:
#define mysel(a,b,c) (bitselect (a,b,(c) ? 0xffffffff : 0))
Getting 251 MH/s

So there's a drop from 470 MH/s to 251 MH/s for dynamic salt support. I'll take it for now, as we also get multihash support with it.
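
For anyone wondering what the quoted mysel macro does: OpenCL's bitselect(a, b, c) takes each result bit from b where the matching bit of c is 1, and from a otherwise. With an all-ones or all-zeros mask, that becomes a branch-free "pick b if c, else a". Below is a small Python model of those semantics on 32-bit values. It is only a sketch to illustrate the macro, not the kernel code, and how the kernel actually uses it for the salt is not shown here.

```python
MASK32 = 0xFFFFFFFF


def bitselect(a, b, c):
    """Model of OpenCL bitselect for 32-bit values.

    Each result bit comes from b where the bit in c is 1, else from a.
    """
    return ((a & ~c) | (b & c)) & MASK32


def mysel(a, b, c):
    """Model of the quoted macro: all of b if c is non-zero, else all of a."""
    return bitselect(a, b, MASK32 if c else 0)


# Whole-word selection, as the macro does it
print(hex(mysel(0x12345678, 0x9ABCDEF0, 1)))
print(hex(mysel(0x12345678, 0x9ABCDEF0, 0)))

# Per-bit selection with a mixed mask: high half from b, low half from a
print(hex(bitselect(0x12345678, 0x9ABCDEF0, 0xFFFF0000)))
```

The point of writing it this way is that the selection costs a couple of bitwise operations instead of a branch.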

More News!

So by telling the OpenCL runtime to disable compiler optimizations (-cl-opt-disable), the speed increased from 251 MH/s to 355 MH/s, lol. If you guys want to play with it, it's in b22+

About 20 GH/s:

With the b22 version and +250 MHz on the Titan X we can get > 20 GH/s on a single card. With +250 MHz on the 980 Ti we're somewhere around 18 GH/s.
Quote:DES Speed.GPU.#1...: 37043.9 kH/s

Quote:DES Speed.GPU.#1...:   106.9 MH/s

Works for me.
I get "only" 2.79B p/s for LM on my Titan, which isn't too shabby either.
Sounds good to me, 106.9 MH/s * 25 = 2672.5 MH/s
So, in this specific example, Titan X is ~7.2 times faster than Titan, both being overclocked.
Serious performance increase over old gen architecture.
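
The two figures in the posts above are easy to re-derive. The sketch below assumes the factor of 25 refers to the 25 DES iterations of traditional DES crypt, so multiplying the descrypt speed by 25 gives a rough raw-DES rate to compare with the LM speed. Variable names are made up for illustration.

```python
# Speeds quoted in the thread, all in MH/s
descrypt_mhs = 106.9        # descrypt speed from the quote above
titan_lm_mhs = 2790.0       # "2.79B p/s" for LM on the Titan
titan_x_lm_mhs = 20086.4    # LM benchmark on the overclocked Titan X

# Assumption: 25 = number of DES iterations in traditional DES crypt
DES_ITERATIONS = 25

raw_des_mhs = descrypt_mhs * DES_ITERATIONS
speedup = titan_x_lm_mhs / titan_lm_mhs

print(f"descrypt * 25 = {raw_des_mhs:.1f} MH/s")
print(f"Titan X vs Titan on LM: {speedup:.1f}x")
```

So the descrypt number scaled by 25 comes out close to the LM speed on the same card, and the Titan X to Titan ratio works out to roughly 7.2.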
OK, so I'm reaching a point where it becomes harder to increase the speed a lot. Here are some final values:


-m 1500: 350 MH/s with dynamic salt, 450 MH/s with fixed salt
-m 3000: 12000 MH/s (out of a theoretical max of 12200 MH/s)!!

Titan X (+250 MHz)

-m 1500: 750 MH/s with dynamic salt
-m 3000: 20100 MH/s

I'm thinking about porting this to the other hashes; it depends on whether you guys request it. There are 3 more algorithms that use DES with a fixed key/data portion to turn it into a hash:

-m 3100 = Oracle H: Type (Oracle 7+)
-m 8500 = RACF
-m 12400 = BSDiCrypt, Extended DES