Creating sha512(sha512($pass).$salt) kernel
#1
Hey everyone, lately I've been trying to write a sha512(sha512($pass).$salt) kernel, but I keep failing. I believe it's because I don't really understand the u64 words in which the SHA-512 state is stored.

I've tried doing it in a few different ways, but none of them seemed to work. I would very much appreciate any hint about what I'm doing wrong or what I'm missing. Cheers!

Here is the code taken from the loop

Code:
/**
  * loop
  */

  for (u32 il_pos = 0; il_pos < il_cnt; il_pos++)
  {
    pw_t tmp = PASTE_PW;

    tmp.pw_len = apply_rules (rules_buf[il_pos].cmds, tmp.i, tmp.pw_len);

    sha512_ctx_t ctx0;

    sha512_init (&ctx0);

    sha512_update_swap (&ctx0, tmp.i, tmp.pw_len);

    sha512_final (&ctx0);

    //sha512($pass)

    const u32 ah = h32_from_64_S (ctx0.h[0]);
    const u32 al = l32_from_64_S (ctx0.h[0]);
    const u32 bh = h32_from_64_S (ctx0.h[1]);
    const u32 bl = l32_from_64_S (ctx0.h[1]);
    const u32 ch = h32_from_64_S (ctx0.h[2]);
    const u32 cl = l32_from_64_S (ctx0.h[2]);
    const u32 dh = h32_from_64_S (ctx0.h[3]);
    const u32 dl = l32_from_64_S (ctx0.h[3]);
    const u32 eh = h32_from_64_S (ctx0.h[4]);
    const u32 el = l32_from_64_S (ctx0.h[4]);
    const u32 fh = h32_from_64_S (ctx0.h[5]);
    const u32 fl = l32_from_64_S (ctx0.h[5]);
    const u32 gh = h32_from_64_S (ctx0.h[6]);
    const u32 gl = l32_from_64_S (ctx0.h[6]);
    const u32 hh = h32_from_64_S (ctx0.h[7]);
    const u32 hl = l32_from_64_S (ctx0.h[7]);

    // split each u64 state word into two u32 halves so the bin-to-hex conversion below can work on 32-bit values

    sha512_ctx_t ctx;

    sha512_init (&ctx);

    ctx.w0[0] = uint_to_hex_lower8_le ((ah >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((ah >> 24) & 255) << 16;
    ctx.w0[1] = uint_to_hex_lower8_le ((ah >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((ah >>  8) & 255) << 16;
    ctx.w0[2] = uint_to_hex_lower8_le ((al >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((al >> 24) & 255) << 16;
    ctx.w0[3] = uint_to_hex_lower8_le ((al >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((al >>  8) & 255) << 16;
    ctx.w1[0] = uint_to_hex_lower8_le ((bh >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((bh >> 24) & 255) << 16;
    ctx.w1[1] = uint_to_hex_lower8_le ((bh >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((bh >>  8) & 255) << 16;
    ctx.w1[2] = uint_to_hex_lower8_le ((bl >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((bl >> 24) & 255) << 16;
    ctx.w1[3] = uint_to_hex_lower8_le ((bl >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((bl >>  8) & 255) << 16;
    ctx.w2[0] = uint_to_hex_lower8_le ((ch >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((ch >> 24) & 255) << 16;
    ctx.w2[1] = uint_to_hex_lower8_le ((ch >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((ch >>  8) & 255) << 16;
    ctx.w2[2] = uint_to_hex_lower8_le ((cl >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((cl >> 24) & 255) << 16;
    ctx.w2[3] = uint_to_hex_lower8_le ((cl >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((cl >>  8) & 255) << 16;
    ctx.w3[0] = uint_to_hex_lower8_le ((dh >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((dh >> 24) & 255) << 16;
    ctx.w3[1] = uint_to_hex_lower8_le ((dh >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((dh >>  8) & 255) << 16;
    ctx.w3[2] = uint_to_hex_lower8_le ((dl >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((dl >> 24) & 255) << 16;
    ctx.w3[3] = uint_to_hex_lower8_le ((dl >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((dl >>  8) & 255) << 16;
    ctx.w4[0] = uint_to_hex_lower8_le ((eh >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((eh >> 24) & 255) << 16;
    ctx.w4[1] = uint_to_hex_lower8_le ((eh >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((eh >>  8) & 255) << 16;
    ctx.w4[2] = uint_to_hex_lower8_le ((el >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((el >> 24) & 255) << 16;
    ctx.w4[3] = uint_to_hex_lower8_le ((el >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((el >>  8) & 255) << 16;
    ctx.w5[0] = uint_to_hex_lower8_le ((fh >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((fh >> 24) & 255) << 16;
    ctx.w5[1] = uint_to_hex_lower8_le ((fh >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((fh >>  8) & 255) << 16;
    ctx.w5[2] = uint_to_hex_lower8_le ((fl >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((fl >> 24) & 255) << 16;
    ctx.w5[3] = uint_to_hex_lower8_le ((fl >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((fl >>  8) & 255) << 16;
    ctx.w6[0] = uint_to_hex_lower8_le ((gh >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((gh >> 24) & 255) << 16;
    ctx.w6[1] = uint_to_hex_lower8_le ((gh >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((gh >>  8) & 255) << 16;
    ctx.w6[2] = uint_to_hex_lower8_le ((gl >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((gl >> 24) & 255) << 16;
    ctx.w6[3] = uint_to_hex_lower8_le ((gl >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((gl >>  8) & 255) << 16;
    ctx.w7[0] = uint_to_hex_lower8_le ((hh >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((hh >> 24) & 255) << 16;
    ctx.w7[1] = uint_to_hex_lower8_le ((hh >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((hh >>  8) & 255) << 16;
    ctx.w7[2] = uint_to_hex_lower8_le ((hl >> 16) & 255) <<  0 | uint_to_hex_lower8_le ((hl >> 24) & 255) << 16;
    ctx.w7[3] = uint_to_hex_lower8_le ((hl >>  0) & 255) <<  0 | uint_to_hex_lower8_le ((hl >>  8) & 255) << 16;

    // at this point, from what I understand, I should have sha512($pass) in hex

    // this part, computing sha512(sha512($pass)), is where I believe I got something wrong;
    // I've tried doing it in three different ways (really just trial and error)

    // first attempt:
    // sha512_transform (ctx.w0, ctx.w1, ctx.w2, ctx.w3, ctx.w4, ctx.w5, ctx.w6, ctx.w7, ctx.h);

    // second attempt:
    // sha512_update_128 (&ctx, w0, w1, w2, w3, w4, w5, w6, w7, 128);

    // third attempt:
    // u64 final[32] = { 0 };
    // final[ 0] = hl32_to_64 (w0[0], w0[1]);
    // final[ 1] = hl32_to_64 (w0[2], w0[3]);
    // final[ 2] = hl32_to_64 (w1[0], w1[1]);
    // final[ 3] = hl32_to_64 (w1[2], w1[3]);
    // final[ 4] = hl32_to_64 (w2[0], w2[1]);
    // final[ 5] = hl32_to_64 (w2[2], w2[3]);
    // final[ 6] = hl32_to_64 (w3[0], w3[1]);
    // final[ 7] = hl32_to_64 (w3[2], w3[3]);
    // final[ 8] = hl32_to_64 (w4[0], w4[1]);
    // final[ 9] = hl32_to_64 (w4[2], w4[3]);
    // final[10] = hl32_to_64 (w5[0], w5[1]);
    // final[11] = hl32_to_64 (w5[2], w5[3]);
    // final[12] = hl32_to_64 (w6[0], w6[1]);
    // final[13] = hl32_to_64 (w6[2], w6[3]);
    // final[14] = hl32_to_64 (w7[0], w7[1]);
    // final[15] = hl32_to_64 (w7[2], w7[3]);

    sha512_update (&ctx, final, 64);

    sha512_update (&ctx, s, salt_len);

    sha512_final (&ctx);

    const u32 r0 = l32_from_64_S (ctx.h[7]);
    const u32 r1 = h32_from_64_S (ctx.h[7]);
    const u32 r2 = l32_from_64_S (ctx.h[3]);
    const u32 r3 = h32_from_64_S (ctx.h[3]);

    COMPARE_M_SCALAR (r0, r1, r2, r3);
  }
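As background for the u64 question: SHA-512 keeps its state as eight 64-bit words, and the printed digest is those words written out most-significant byte first. The host-side C sketch below shows how one such word splits into two 32-bit halves and into individual digest bytes. The names hi32, lo32 and digest_byte are made up for illustration; they only mirror what h32_from_64_S and l32_from_64_S are assumed to do and are not hashcat functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for h32_from_64_S: upper 32 bits of a state word. */
uint32_t hi32 (uint64_t v)
{
  return (uint32_t) (v >> 32);
}

/* Hypothetical stand-in for l32_from_64_S: lower 32 bits of a state word. */
uint32_t lo32 (uint64_t v)
{
  return (uint32_t) (v & 0xffffffffu);
}

/* Byte k of the word in digest order (k = 0 is the first byte printed). */
uint32_t digest_byte (uint64_t v, int k)
{
  return (uint32_t) (v >> (56 - 8 * k)) & 0xff;
}
```

For v = 0x0123456789abcdef, hi32 gives 0x01234567, lo32 gives 0x89abcdef, and digest_byte (v, 0) is 0x01, the byte that yields the first two hex characters. Each 64-bit word therefore contributes 16 hex characters, and eight words give 128.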
#2
Shouldn't the first hash be 128 hex characters (instead of 64)? Just test with "echo -n hashcat | sha512sum": the output is exactly 128 hex chars long.

You would also need to test whether the salt needs to be byte-swapped. For instance, I see sha512_update_global_swap () used in the unmodified -m 1720 kernel.

The COMPARE_M_SCALAR is also suspicious... why would you test with multiple (2+) hashes first? Use the sxx kernel function for a single hash and COMPARE_S_SCALAR to compare.

For testing, you should also always make sure to remove the cached kernel folder (kernels/).

These are just a few hints; I haven't looked carefully at the code yet.

It would make sense to use printf () for debugging/troubleshooting and maybe an (Intel) CPU with Intel OpenCL Runtime installed (it's probably the easiest to test and debug with printf ()).
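One way to use printf () here is to dump the eight state words right after the first sha512_final () and compare them with the sha512sum output. The formatting can be sketched in host-side C as below; state_to_hex is a hypothetical helper, not part of hashcat, and inside a kernel the printf format specifier for 64-bit values may differ.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Write the eight 64-bit state words as 128 lowercase hex characters.
   out must have room for 128 characters plus the terminating NUL. */
void state_to_hex (const uint64_t h[8], char out[129])
{
  for (int i = 0; i < 8; i++)
  {
    /* each word becomes exactly 16 zero-padded hex characters */
    snprintf (out + i * 16, 17, "%016" PRIx64, h[i]);
  }
}
```

If the string produced from the first context does not match sha512sum for the same password, the problem lies before the hex conversion; if it matches, the problem is in how the hex characters are fed into the second hash.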
#3
Thank you very much for the answer.

The "sha512_update (&ctx, final, 64);" line was supposed to be commented out, as it was part of the third attempt. I understand now that the length of 64 was a mistake; I got it from looking at the 21000 - sha512(sha512_bin(pass)) kernel. However, changing it to 128 didn't fix the kernel.

I didn't use sha512_update_global_swap () because I didn't see it being used in similar kernels like 20710 - sha256(sha256($pass).$salt). Just to make sure, I tried switching to it, but that didn't seem to work either.

It is COMPARE_M_SCALAR because I just copied this part. In the sxx function I have pretty much the same code, so I didn't think it would be relevant.

I do remember to remove the cached kernels.

I will try to play around with printf (), but I gotta admit it is a bit confusing to me, and so is the Intel OpenCL Runtime.

Just additional info:
- I double-checked that I'm testing with the correct hash (verified plaintext)
- I didn't make a new kernel file; I'm just editing m01710_a0-pure.cl and running hashcat with --self-test-disable. I did it because I assumed it wouldn't matter, but perhaps I'm wrong
#4
In OpenCL/m20710_a0-pure.cl I don't see ctx.w0[0] etc. being modified directly...

Why don't you just start with 2 very easy blocks: init/update(s)/final for the first SHA512 hash, then init/update(s)/final for the second SHA512 hash?

You could just use:

Code:
sha512_ctx_t ctx2;

sha512_init (&ctx2);

u32 second[32];

second[ 0] = ...
...
second[31] = ... // 32 * 4 = 128

sha512_update (&ctx2, second, 128);
sha512_update (&ctx2, salt, salt_len); // or sha512_update_swap () ?
sha512_final (&ctx2);

First rule of optimization/implementation: do not optimize! Just make it work with simple/easy code... optimizations come later.
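A rough sketch of how the second[] words could be filled, written as plain host-side C rather than kernel code. It assumes that second[] should hold the 128 lowercase hex characters of the first digest, packed four characters per 32-bit word with the first character in the most significant byte. Whether sha512_update () expects exactly this byte order or the swapped one is an assumption to verify, for example with printf (). The names hex_digit and hex_encode_state are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: lowercase ASCII hex digit for a 4-bit value. */
uint32_t hex_digit (uint32_t nibble)
{
  return (nibble < 10) ? ('0' + nibble) : ('a' + nibble - 10);
}

/* Hex-encode the 8 x 64-bit state into 32 x 32-bit words (128 characters).
   Assumed layout: first character of each group of four in the top byte. */
void hex_encode_state (const uint64_t h[8], uint32_t out[32])
{
  for (int i = 0; i < 8; i++)
  {
    for (int j = 0; j < 4; j++)
    {
      /* two digest bytes become four hex characters, i.e. one output word */
      const uint32_t b0 = (uint32_t) (h[i] >> (56 - 16 * j)) & 0xff;
      const uint32_t b1 = (uint32_t) (h[i] >> (48 - 16 * j)) & 0xff;

      out[i * 4 + j] = hex_digit (b0 >> 4) << 24
                     | hex_digit (b0 & 15) << 16
                     | hex_digit (b1 >> 4) <<  8
                     | hex_digit (b1 & 15) <<  0;
    }
  }
}
```

With h[0] = 0x0123456789abcdef, out[0] becomes 0x30313233, the ASCII characters "0123". The 32 output words cover 32 * 4 = 128 bytes, which matches the length passed to the second update.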