Performance drop with partially known long plain NTLM
#1
Hi All,

Not a request for help as such, mainly interested to know if the situation I have is expected and from an academic perspective why it is occurring.

I've been trying to crack an unknown NTLM hash (set as a challenge by a friend) using a single GTX 670 and cudaHashcat-lite64.exe on Windows with various masks. I've been focusing on masks of 7 to 10 characters and getting consistent speeds of about 2850 M/s, which I *think* is about normal for this card (confirmation would be reassuring).

Today I changed my tactics based on new information he provided - namely it's a 12 or 13 character plain and I now know the first five characters. So I modified my mask to be "<5-known-characters>?1?1?1?1?1?1?1?1" (where ?1 is ?l?d?s) and set it to solve using 12 followed by 13 characters. My surprise came when I saw the new hashing performance was only about 120 M/s.
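For scale, here is a rough sketch of the keyspace arithmetic behind that mask. It assumes ?l is 26 characters, ?d is 10 and ?s is the 33 printable ASCII symbols including space; the function names are just illustrative, and the speeds are the two figures quoted in this post.

```python
# Charset sizes assumed for the built-in mask classes combined in ?1 = ?l?d?s.
LOWER = 26    # ?l: a-z
DIGIT = 10    # ?d: 0-9
SPECIAL = 33  # ?s: space plus printable ASCII punctuation (assumption)

CHARSET = LOWER + DIGIT + SPECIAL  # symbols per unknown position
KNOWN_PREFIX_LEN = 5               # leading characters that are already known


def keyspace(total_len: int) -> int:
    """Number of candidates for a plain of total_len chars with a known prefix."""
    return CHARSET ** (total_len - KNOWN_PREFIX_LEN)


def days_to_exhaust(candidates: int, mhash_per_sec: float) -> float:
    """Worst-case days needed to try every candidate at a speed given in M/s."""
    return candidates / (mhash_per_sec * 1e6) / 86400


if __name__ == "__main__":
    for length in (12, 13):
        n = keyspace(length)
        print(length, n, days_to_exhaust(n, 2850), days_to_exhaust(n, 120))
```

Under these assumptions the 12-character case is 69^7, or about 7.4e12 candidates, so the difference between the two speeds decides whether the attack is practical at all.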

My understanding is that the algorithm shouldn't depend on the plain length that strongly (borne out by an attempt at a 13 character ?l mask which gave 2850 M/s). It seems like the bottleneck comes about due to effectively running a hybrid attack using just masks.

The interesting thing to me is that the program is now reporting about 88% GPU utilisation rather than about 99% (though that wouldn't by itself account for the roughly 24x drop in performance).

Any thoughts, or is this behaviour expected?
#2
this is expected because you aren't giving the left side enough work to do. i'm surprised your gpu utilization is that high, when i try your mask i only get about 2% utilization on one GPU.
#3
same here, I did find that plus runs it faster in -a 6 than lite does.
#4
I had assumed that inputting the mask as a single string of constants and variables would in effect create a hybrid dict+mask attack with only one dictionary entry (which to me seems computationally equivalent to setting up a dict+mask attack). But I suppose the way the algorithms in lite are implemented doesn't take the 'constant' left-hand side into account, so it can't focus its effort on the 'variable' portion?
#5
I hate to double post, but I did try using -plus with a single entry dictionary containing my five letters, a 7 character mask, and the single hash and only achieved about 2.8 M/s at best (compared to about 1600 M/s in -plus with a standard mask attack). Am I to understand that the hybrid+mask algorithm still suffers from a similar left-side unoccupied performance bottleneck?
#6
If you partially know the left part of the password, you had better not use oclHashcat-lite. It's built for maximum performance, so it works with so-called reversal techniques. These techniques are based on holding all parts of the password constant except the first 4 chars. If exactly those chars are static, there is no benefit from reversal and you will face a performance loss.

This is why you should use oclHashcat-plus, but not with -a 6 or -a 7 mode to emulate the static part. That will not work. The best way to achieve the full (or nearly full) performance that plus can give you is to supply the known part of the password as the salt to your hash and then use a hash-type like md5($salt.$pass) or sha1($salt.$pass).

Another option is a multi-rule. This should be used if your hash type is one for which oclHashcat-plus does not support an added salt, like NTLM or descrypt. You add a rule like i0h i1e i2l i3l i4o, which prepends the word "hello" to all your candidates. Remember to use this as a multi-rule, so you have to combine it with another rule file, like -r rules/best64.rule -r your.rule
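As a minimal sketch of how such a rule line could be built for any known prefix: the helper names below are hypothetical, and apply_rule only emulates the insert function (iNX, insert character X at position N) so the result can be checked offline. It is not how the rule engine itself is implemented.

```python
# Position digits for positional rule functions: 0-9, then A-Z for 10-35.
POSITIONS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def prepend_rule(prefix: str) -> str:
    """Build one rule line of iNX functions that inserts prefix at positions 0..n-1."""
    if len(prefix) > len(POSITIONS):
        raise ValueError("prefix too long for single-character positions")
    return " ".join(f"i{POSITIONS[i]}{ch}" for i, ch in enumerate(prefix))


def apply_rule(rule: str, word: str) -> str:
    """Emulate only the iNX function on one candidate (illustration, not the real engine)."""
    # Each function is 3 chars and functions are separated by exactly one space.
    for start in range(0, len(rule), 4):
        func = rule[start:start + 3]
        if len(func) != 3 or func[0] != "i":
            raise ValueError(f"unsupported rule function: {func!r}")
        pos = POSITIONS.index(func[1])
        if pos > len(word):
            raise ValueError("insert position is past the end of the word")
        word = word[:pos] + func[2] + word[pos:]
    return word


if __name__ == "__main__":
    rule = prepend_rule("hello")
    print(rule)
    print(apply_rule(rule, "world12"))
```

Writing the output of prepend_rule to a file gives the one-line your.rule from the example above.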
#7
Thanks atom. I'm trying to crack NTLM, so I'm looking at your multi-rule suggestion. I can understand (and have tested) how that would work if I had a wordlist that I could apply the prepend rule to, and know I could generate such a thing using maskprocessor, but I was under the impression that using huge word lists is a bad idea with GPGPU? I'm also not 100% clear what the purpose of the additional rule file is if I was cracking in -a 0 - my guess was that it acts as a pseudo mask by having all combinations of append rules?

I can only conclude that I haven't quite grasped where "my candidates" are coming from. I tried applying a prepend rule to -a 3 (the program ignored the rule), and running -a 0 with both a prepend and an append rule together, with and without an empty dictionary (unsurprisingly, -plus threw a missing-dictionary error).

In short, could you let me know which attack mode you were referring to in your multi-rule suggestion?
#8
rules work only with -a 0
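To picture where the candidates come from in that case, here is a small sketch: the right-hand side is produced as plain words (in practice by a wordlist or a generator such as maskprocessor), and the known part is then prepended to each word, which is the effect of the prepend rule. The function names are hypothetical, the ?l?d?s charset is assumed to be 69 printable ASCII characters, and the demo uses a tiny two-letter charset so it finishes instantly.

```python
import itertools
import string

# Assumed equivalent of the custom charset ?1 = ?l?d?s (69 characters).
CHARSET_1 = string.ascii_lowercase + string.digits + " " + string.punctuation


def mask_words(charset: str, length: int):
    """Yield every word of the given length over charset, like a fixed-length mask."""
    for combo in itertools.product(charset, repeat=length):
        yield "".join(combo)


def with_prefix(words, prefix: str):
    """Prepend the known left part to each candidate, as the prepend rule would."""
    for word in words:
        yield prefix + word


if __name__ == "__main__":
    # Tiny demo charset; the real attack would use CHARSET_1 and length 7 or 8.
    for candidate in with_prefix(mask_words("ab", 2), "hello"):
        print(candidate)
```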