In OpenCL/m20710_a0-pure.cl I don't see ctx.w0[0] being modified directly, etc.
Why don't you just start with 2 very easy blocks: init/update(s)/final for the first SHA512 hash, and init/update(s)/final for the second SHA512 hash?
You could just use:
Code:
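sha512_ctx_t ctx2; // separate context for the second hash
sha512_init (&ctx2);
u32 second[32]; // input for the second hash, built from the output of the first
second[ 0] = ...
...
second[31] = ... // 32 * 4 = 128 bytes
sha512_update (&ctx2, second, 128);
sha512_update (&ctx2, salt, salt_len); // or sha512_update_swap () ?
sha512_final (&ctx2);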
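For the second[] words, here is a rough host-side C sketch of how they could be filled. It assumes (not confirmed above) that the second hash takes the lowercase hex of the first 64-byte digest, which would explain the 128 bytes, and that the words are packed with the first character in the top byte. digest_to_hex_words is a made-up helper name, not a hashcat function. If the packing in your kernel is the other way round, that is exactly where the sha512_update () vs. sha512_update_swap () question comes from.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of hashcat): hex-encode a 64-byte digest
   into 32 words, 4 ASCII characters per word.
   ASSUMPTION: lowercase hex, first character in the most significant byte. */
static void digest_to_hex_words (const uint8_t digest[64], uint32_t second[32])
{
  static const char hex[] = "0123456789abcdef";

  for (size_t i = 0; i < 32; i++)
  {
    /* each output word covers two digest bytes = four hex characters */
    const uint8_t b0 = digest[i * 2 + 0];
    const uint8_t b1 = digest[i * 2 + 1];

    second[i] = ((uint32_t) hex[b0 >> 4] << 24)
              | ((uint32_t) hex[b0 & 15] << 16)
              | ((uint32_t) hex[b1 >> 4] <<  8)
              | ((uint32_t) hex[b1 & 15] <<  0);
  }
}
```

With something like this you can print the 32 words on the host and compare them against what the kernel puts into second[] before worrying about anything else.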
First rule of optimization/implementation: do not optimize! Just make it work with simple/easy code... optimizations come later.