SAP B: which card? (estimate)
#1
I am running some tests with SAP CODVN B (BCODE) (-m 7700) on the cards below (tested on Windows 7 x64, 1 hash loaded):
Device #1: Tahiti, 3072MB, 1050Mhz, 32MCU (=HD 7970 vapor)
Device #2: Hawaii, 3072MB, 1040Mhz, 44MCU (=290X)
Speed.GPU.#1...: 236.2 kH/s
Speed.GPU.#2...: 322.8 kH/s

Device #1: GeForce GTX 660 Ti, 2048MB, 1137Mhz, 7MCU
Device #2: GeForce GTX 260, 896MB, 1242Mhz, 27MCU
Speed.GPU.#1...: 60328 H/s
Speed.GPU.#2...: 25616 H/s

Now I've added an Asus HD 6990 which only does:

Device #1: Cayman, 2048MB, 830Mhz, 24MCU
Device #2: Cayman, 2048MB, 830Mhz, 24MCU
Speed.GPU.#1...: 37173 H/s
Speed.GPU.#2...: 37182 H/s

Looking at http://golubev.com/gpuest.htm, I would have expected the HD 6990 to be close to the 290X, but it is lagging behind. It probably has something to do with the algorithm, because with MD5 I get:

Device #1: Cayman, 2048MB, 830Mhz, 24MCU
Device #2: Cayman, 2048MB, 830Mhz, 24MCU
Device #3: Hawaii, 3072MB, 1040Mhz, 44MCU
Speed.GPU.#1.: 5264.3 MH/s
Speed.GPU.#2.: 5267.3 MH/s
Speed.GPU.#3.: 12174.7 MH/s

What would be the best method to estimate SAP CODVN B performance before I try yet another (second-hand) card?
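
The figures above mix H/s, kH/s and MH/s, which makes them awkward to compare by eye. A minimal Python sketch that normalizes the status lines to plain H/s; the helper name `parse_speed` and the regex are only illustrative and are not part of hashcat:

```python
import re

# Multipliers for the unit prefixes that appear in the status lines above.
UNIT = {"H/s": 1.0, "kH/s": 1e3, "MH/s": 1e6}

# Matches e.g. "Speed.GPU.#1...: 236.2 kH/s" or "Speed.GPU.#3.: 12174.7 MH/s".
LINE = re.compile(r"Speed\.GPU\.#(\d+)\.*:\s*([\d.]+)\s*(H/s|kH/s|MH/s)")

def parse_speed(line):
    """Return (device_number, speed_in_H_per_s) for one Speed.GPU line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"not a Speed.GPU line: {line!r}")
    return int(m.group(1)), float(m.group(2)) * UNIT[m.group(3)]

# SAP CODVN B single-hash speeds copied from this post.
sap_b = {
    "Tahiti": parse_speed("Speed.GPU.#1...: 236.2 kH/s")[1],
    "Hawaii": parse_speed("Speed.GPU.#2...: 322.8 kH/s")[1],
    "GTX 660 Ti": parse_speed("Speed.GPU.#1...: 60328 H/s")[1],
    "Cayman": parse_speed("Speed.GPU.#1...: 37173 H/s")[1],
}

# Print the cards from fastest to slowest in a common unit.
for card, speed in sorted(sap_b.items(), key=lambda kv: -kv[1]):
    print(f"{card:12s} {speed / 1e3:8.1f} kH/s")
```

With everything in one unit, the ordering is Hawaii, Tahiti, GTX 660 Ti, Cayman.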
#2
Try running:
Code:
oclHashcat64.exe -b -m 7700

With an old HD5970 at stock clocks, I'm getting:
Code:
Device #1: Cypress, 1024MB, 725Mhz, 20MCU
Device #2: Cypress, 1024MB, 725Mhz, 20MCU

Hashtype: SAP CODVN B (BCODE)
Workload: 1024 loops, 64 accel

Speed.GPU.#1.: 94908.5 kH/s
Speed.GPU.#2.: 75939.3 kH/s
Speed.GPU.#*.:   170.8 MH/s
#3
If I run a benchmark instead of attacking a single hash (brute force), I see the following results:

oclHashcat64.exe -b -m 7700
oclHashcat v1.31 starting in benchmark-mode...

Device #1: Cayman, 2048MB, 830Mhz, 24MCU
Device #2: Cayman, 2048MB, 830Mhz, 24MCU
Device #3: Tahiti, 3072MB, 830Mhz, 32MCU
Device #4: Hawaii, 3072MB, 830Mhz, 44MCU

Hashtype: SAP CODVN B (BCODE)
Workload: 1024 loops, 64 accel

Speed.GPU.#1.: 118.7 MH/s
Speed.GPU.#2.: 118.7 MH/s
Speed.GPU.#3.: 725.3 MH/s
Speed.GPU.#4.: 985.7 MH/s
Speed.GPU.#*.: 1948.4 MH/s

cudaHashcat64.exe -b -m 7700
cudaHashcat v1.31 starting in benchmark-mode...

Device #1: GeForce GTX 660 Ti, 2048MB, 1137Mhz, 7MCU

Hashtype: SAP CODVN B (BCODE)
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 293.6 MH/s

In the benchmark too, the HD 6990 lags far behind the 290X. Even my old GTX 660 Ti is faster.
The HD 6990 is somewhat faster than the HD5970 you benchmarked, which is in line with the estimates from http://golubev.com/gpuest.htm

Still, I don't understand the big difference between the HD 6990 and the 290X.
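
To put a number on that gap, here is a rough sketch that takes the Hawaii-to-Cayman ratio for SAP CODVN B from the benchmark above and compares it with the same ratio for the MD5 speeds from post #1. The dictionary layout and names are just for illustration; the figures are the ones posted in this thread (per GPU, in MH/s).

```python
# Per-GPU speeds in MH/s, copied from the posts in this thread.
speeds = {
    "SAP CODVN B (benchmark)": {"Cayman": 118.7, "Hawaii": 985.7},
    "MD5 (post #1)": {"Cayman": 5264.3, "Hawaii": 12174.7},
}

# How many times faster one Hawaii GPU is than one Cayman GPU, per algorithm.
ratios = {
    algo: cards["Hawaii"] / cards["Cayman"]
    for algo, cards in speeds.items()
}

for algo, ratio in ratios.items():
    print(f"{algo}: Hawaii is {ratio:.1f}x one Cayman GPU")
```

The ratio for SAP CODVN B comes out several times larger than the one for MD5, so the gap depends on the algorithm and not just on the raw speed of the cards.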
#4
You're comparing VLIW4 to GCN; that's the big difference. I'm not familiar with the SAP B algorithm, but I think the obvious answer here is that it is better suited to GCN than to VLIW.
#5
It looks like GCN is indeed better for the SAP B algorithm.

Thanks to KT819GM, I conclude that VLIW5 will not outperform GCN either. So within the ATI range, it looks like SAP B performance is best on the GCN architecture.

Is there an Nvidia card with outstanding performance for SAP B? So my question remains:
What would be the best method to estimate performance on the SAP CODVN B algorithm?
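
One rough approach, offered as an assumption and not a proven rule: within a single architecture, scale a measured speed by compute units times clock. The helper `scale_estimate` below is hypothetical, and the MCU and clock values are the ones reported in the device lines earlier in this thread.

```python
def scale_estimate(ref_speed, ref_mcu, ref_mhz, new_mcu, new_mhz):
    """Scale a measured speed linearly by (compute units x clock).

    Assumption: throughput is proportional to MCU count times clock,
    which can only be expected to hold within one architecture.
    """
    return ref_speed * (new_mcu * new_mhz) / (ref_mcu * ref_mhz)

# Benchmark in post #3 (MH/s): predict Hawaii from Tahiti, both GCN.
bench_est = scale_estimate(725.3, 32, 830, 44, 830)
bench_measured = 985.7

# Single-hash attack in post #1 (kH/s): same pair of cards.
attack_est = scale_estimate(236.2, 32, 1050, 44, 1040)
attack_measured = 322.8

# Cross-architecture check (MH/s): predict Hawaii (GCN) from Cayman (VLIW4).
cross_est = scale_estimate(118.7, 24, 830, 44, 830)

for name, est, real in [
    ("Tahiti -> Hawaii (benchmark)", bench_est, bench_measured),
    ("Tahiti -> Hawaii (attack)", attack_est, attack_measured),
    ("Cayman -> Hawaii (benchmark)", cross_est, bench_measured),
]:
    error = (est - real) / real * 100
    print(f"{name}: estimated {est:.1f}, measured {real:.1f}, error {error:+.1f}%")
```

On the numbers in this thread, the Tahiti-to-Hawaii estimates land within about 2% of the measured speeds, while the Cayman-to-Hawaii estimate is far too low. So this kind of scaling only looks usable when the reference card has the same architecture as the card being considered, and a real `-b -m 7700` benchmark from someone who owns that card remains the safer source.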
#6
GTX 980:

Speed.GPU.#1.: 662.0 MH/s