secp256k1 performance issue
I was looking for a Bitcoin brainwallet cracker (like Brainflayer) that uses the GPU instead of the CPU.
I came across Hashcat, but unfortunately it doesn’t implement this algorithm. Luckily, it does have all the parts needed to calculate a Bitcoin public key.

The algorithm involves computing SHA-256 and RIPEMD-160 hashes and deriving a secp256k1 public key by multiplying the predefined generator point G by the private key.
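To make the steps concrete, here is a minimal pure-Python sketch of the derivation as I understand it. It is not hashcat or Brainflayer code: the function names (`point_add`, `scalar_mult`, `brainwallet_pubkey`, `hash160`) are made up for illustration, the curve constants are the standard secp256k1 parameters, and the uncompressed key encoding is an assumption. It uses plain double-and-add, so it is slow and not constant-time.

```python
import hashlib

# Standard secp256k1 domain parameters (curve: y^2 = x^3 + 7 over F_P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a == -b
    if a == b:
        # Tangent slope for doubling; inverse via Fermat's little theorem.
        lam = 3 * x1 * x1 * pow(2 * y1, P - 2, P) % P
    else:
        lam = (y2 - y1) * pow(x2 - x1, P - 2, P) % P
    x3 = (lam * lam - x1 - x2) % P
    y3 = (lam * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point=G):
    """Compute k * point with simple double-and-add (LSB first)."""
    result, addend = None, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def brainwallet_pubkey(passphrase):
    """Private key = SHA-256(passphrase); return the uncompressed public key."""
    priv = int.from_bytes(hashlib.sha256(passphrase).digest(), "big")
    x, y = scalar_mult(priv)
    return b"\x04" + x.to_bytes(32, "big") + y.to_bytes(32, "big")


def hash160(data):
    """RIPEMD-160(SHA-256(data)); raises ValueError if the local
    OpenSSL build does not provide RIPEMD-160."""
    return hashlib.new("ripemd160", hashlib.sha256(data).digest()).digest()
```

Calling `hash160(brainwallet_pubkey(b"some passphrase"))` would then give the 20-byte value that gets compared against the target address hash. The scalar multiplication takes roughly 256 doublings plus about 128 additions per candidate, each with its own modular arithmetic on 256-bit numbers, while the two hashes are cheap by comparison.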

So I decided to give it a try and implement a module on my own.
First, I decided to test secp256k1 performance.
I copied the SHA256 (mode 1400) a0_pure kernel, added a `point_mul()` call in the loop, and ran hashcat (straight mode, dictionary + 30k rules, single hash).
To my disappointment, it only reached 170 KH/s (with -w 3).

My hardware: CPU i3-8100, GPU GTX 1060 3 GB.
For reference, Brainflayer runs at 130 KH/s on a single core and at 440 KH/s combined when running 4 instances in parallel on the CPU.

Hashcat runs at 430 MH/s if I comment out the `point_mul` call.

I tried playing with different kernel settings (accel, loops, threads), and I managed to get 210 KH/s at most.

Here’s my code (see the last 2 commits):
As I said, I basically copied the SHA256 a0_pure kernel and added the bare minimum of code needed to make the `point_mul` call.

Is this the expected performance of the secp256k1 library? If not, what am I missing?

Thanks in advance.

Attached files: sha256.png, sha256+secp256k1.png