04-21-2021, 03:18 AM
(This post was last modified: 04-21-2021, 03:35 AM by Chick3nman.)
"I mean we all want the same end results right? Optimization! No matter the cost."
We do significant optimization on the backend, but there's only so much we can do without being able to see into the future during an attack. Consider the following example:
Say you have a rule with multiple functions that affect the length, as many rules do. This is not uncommon.
Plaintext: pass
Rule: p5 p2
The above will duplicate the word 5 times, then duplicate that buffer 2 times. The first processing step takes in "pass" and produces "passpasspasspasspass". The next rule step takes "passpasspasspasspass" and would produce "passpasspasspasspasspasspasspasspasspass", but that is now 40 characters long, and the optimized-kernel length limit for MD5 is 31 characters. What do we do? We don't process the second rule function at all and return the buffer as is; there's already code that optimizes that step completely away and returns the plain exactly as it came out of the first rule function. https://github.com/hashcat/hashcat/blob/...d.cl#L1188
You'll see here, though, that this is NOT a rejection: the plain from the previous step still gets processed, because that buffer is within bounds and it's basically too late. We've already incurred the penalty of loading the buffer and initializing the hash, and we still have to wait for the other threads to finish before we can get a new word. This leads to some interesting situations.
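Here's a minimal sketch of that behavior (not the actual kernel code; the function name apply_p_chain and the constant OPTIMIZED_MAX_LEN are just made up for illustration). It assumes pN yields N copies of the current buffer, following the example above:

OPTIMIZED_MAX_LEN = 31  # MD5 optimized-kernel length limit

def apply_p_chain(word, counts, max_len=OPTIMIZED_MAX_LEN):
    buf = word
    for n in counts:
        if len(buf) * n > max_len:
            # Over the limit: this function is skipped, NOT rejected --
            # the buffer from the previous step still gets hashed.
            continue
        buf = buf * n
    return buf

print(apply_p_chain("pass", [5, 2]))  # "passpasspasspasspass" -- the p2 step is skipped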
What if your rules look like this:
p5 p3
p5 p4
p5 p7
All three of those are valid, unique rules. All three will get processed and kicked back at the same stage, and all three will cause the same candidate to be tested each time, as the sketch below illustrates. This is a nightmare if you are trying to optimize everything away, because the failure happens too late in the process to throw the work away and gain speed. Throwing these rules away before they ever reach the rule-processor step in the kernel is by far the best approach, since it saves the penalty of duplicating the word in the first place.
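Using the apply_p_chain() sketch from above, you can see all three rules collapse to the same 20-character candidate once the second function blows past the cap:

for second in (3, 4, 7):
    print("p5 p%d ->" % second, apply_p_chain("pass", [5, second]))
# All three lines print "passpasspasspasspass", so the identical candidate
# gets hashed three times.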
Now, it's possible that we could do some optimization prior to loading these rules, and that's what I was referring to when I said I had already looked into this a bit. Given the rule p8 p4, we know that what it actually means is p8*4, and we can store it condensed as pW (p32). If the kernel we are running has a max password length of 31, like the MD5 optimized kernel does, we can effectively throw that rule away before it's even loaded. Likewise, if the rules p5 p4 and p4 p5 create the same plaintext (assuming it's valid), then running both of them just creates duplicate work and there's little reason to do so. Removing rules that are "unique" but produce duplicate work is a serious optimization that can be made. This doesn't work for everything, but it can work for a few simple cases like the examples above, and it could shave a significant amount off the total keyspace, especially with huge wordlists like you are running.

Doing this processing outside of hashcat would not be super difficult, though it would rely on you knowing the parameters of your future attack to decide which rules to reject. Even without that info, condensing rules to identify duplicate final states could still be done. I may release a tool to do some of this before I work on implementing it in hashcat; I think we may already handle some of this for certain rules.
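As a rough sketch of what such a pre-filter could look like (hypothetical, not an existing hashcat tool; condense_p_rule, prefilter, and min_word_len are names I made up), again assuming pN means N copies and using hashcat's 0-9/A-Z digit notation for the count, so pW is p32:

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # p5, pA (=10), pW (=32), ...

def condense_p_rule(rule):
    """'p8 p4' -> total multiplier 32, i.e. the single function 'pW'."""
    total = 1
    for func in rule.split():
        if not (len(func) == 2 and func[0] == "p" and func[1] in DIGITS):
            return None  # not a pure pN chain; leave such rules alone
        total *= DIGITS.index(func[1])
    return total

def prefilter(rules, min_word_len, max_buf_len=31):
    kept, seen = [], set()
    for rule in rules:
        total = condense_p_rule(rule)
        if total is None:
            kept.append(rule)  # can't reason about it, keep it
            continue
        if min_word_len * total > max_buf_len:
            continue           # even the shortest word overflows: reject the rule
        if total in seen:
            continue           # same final state as a rule we already kept
        seen.add(total)
        kept.append(rule)
    return kept

print(prefilter(["p2 p3", "p3 p2", "p2 p4"], min_word_len=4))
# -> ['p2 p3']: 'p3 p2' condenses to the same final state, and 'p2 p4'
#    can never fit in 31 characters for 4-character words.

This is exactly where knowing your attack parameters matters: the rejection step only works if you know the shortest word in the wordlist, while the dedup-by-condensed-form step works without that information.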