When I spoke with Atom about this a while back, in relation to this exact issue, he explained that there is no way to "conditionalize" (poor word, I know) the execution of rules, particularly on the GPU. Take truncating a short word as an example: if the truncation point is beyond the end of the word, the new password length is still set to the "truncated" length, creating a NUL-appended value.
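To make the truncation case concrete, here is a rough sketch in Python rather than GPU code. The buffer size and both function names are made up for illustration, and this is only my model of the behaviour described above, not oclHashcat's actual kernel.

```python
BUF_SIZE = 16  # hypothetical fixed-size, zero-filled candidate buffer


def truncate_unconditional(word: bytes, n: int) -> bytes:
    """Branch-free truncation: the length is set to n no matter what."""
    buf = bytearray(BUF_SIZE)      # zero-filled scratch buffer
    buf[:len(word)] = word         # copy the candidate in
    new_len = n                    # no "only if n < len(word)" check
    return bytes(buf[:new_len])


def truncate_checked(word: bytes, n: int) -> bytes:
    """What a conditional rule would do: never grow the word."""
    return word[:n]


print(truncate_unconditional(b"password", 4))  # ordinary truncation
print(truncate_unconditional(b"1", 3))         # short word picks up NUL bytes
print(truncate_checked(b"1", 3))               # conditional version leaves it alone
```

The second call is the problem case: the word is shorter than the truncation point, so the result carries the zero bytes from the buffer.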
There is no way to "fix" this and still retain high speed.
As a result, I had to change how I used oclHashcat - I never use it with the --remove option, and always post-check the "found" passwords against the original hashes.
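The post-check itself is simple. A minimal sketch, assuming raw MD5 hashes and assuming the found plaintexts have already been read in as bytes; the function name and the data layout are my own, not anything oclHashcat provides:

```python
import hashlib


def post_check(found, original_hashes):
    """Split (hash_hex, plaintext_bytes) pairs into verified and suspect lists.

    A pair is verified only if the hash is one we were actually attacking
    and re-hashing the reported plaintext reproduces it.
    """
    verified, suspect = [], []
    for hash_hex, plain in found:
        recomputed = hashlib.md5(plain).hexdigest()
        if hash_hex.lower() in original_hashes and recomputed == hash_hex.lower():
            verified.append((hash_hex, plain))
        else:
            suspect.append((hash_hex, plain))
    return verified, suspect


original = {"c4ca4238a0b923820dcc509a6f75849b"}
found = [
    ("c4ca4238a0b923820dcc509a6f75849b", b"1"),  # correct plaintext
    ("c4ca4238a0b923820dcc509a6f75849b", b"2"),  # wrong plaintext reported
]
ok, bad = post_check(found, original)
print(len(ok), "verified,", len(bad), "suspect")
```

Anything that lands in the suspect list is a hash I would have lost for good had --remove been in use.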
Moving to the new output format doesn't "fix" the unfixable problem of conditional-rules-on-gpu, but it does at least show exactly what was found, and it offers much better long-term storage of found passwords in the output file (and gives some hope of re-using this data).
This _does_ require oclHashcat to pass the password length back to the output correctly, *after* rule application. It does not do so currently; I think it assumes that the output password length is the same as the input password length. Either that, or the length is not preserved and a strlen() is being used, which is not a good idea because strlen() stops at the first NUL byte. Most hash algorithms (including MD5) do not care which bytes are being hashed and work on a "pointer-and-length" buffer; we need to be able to represent that on input (in the dictionaries) and in the output.
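A small Python sketch of why strlen() is the wrong tool here. The c_strlen helper is mine and only imitates the C function; hashlib.md5 takes the whole byte string, which is the "pointer-and-length" behaviour I mean.

```python
import hashlib


def c_strlen(buf: bytes) -> int:
    """Length up to (not including) the first NUL byte, like C's strlen()."""
    idx = buf.find(b"\x00")
    return len(buf) if idx == -1 else idx


candidate = b"1\x002"  # three bytes: '1', NUL, '2'

# Hashing with the real length covers all three bytes.
full_digest = hashlib.md5(candidate).hexdigest()

# Hashing with a strlen()-derived length silently stops at the NUL,
# so this is just the MD5 of b"1".
strlen_digest = hashlib.md5(candidate[:c_strlen(candidate)]).hexdigest()

print(len(candidate), c_strlen(candidate))
print(full_digest)
print(strlen_digest)
```

The two digests differ, so any length recomputed with strlen() reports a plaintext that does not hash back to the cracked value.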
In this example, seeing the output:
c4ca4238a0b923820dcc509a6f75849b:1
a933d13f81649bebe035dc21f4002ff1:$HEX[310032]
makes it perfectly obvious what happened... and you can then reuse
$HEX[310032]
in a password dictionary file to indicate the 3-byte string 1 NUL 2 (bytes 0x31 0x00 0x32).
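For anyone wanting to handle this notation in their own scripts, here is a minimal sketch of encoding and decoding it. The rule for when to hex-encode (anything outside printable ASCII, plus plaintexts that would themselves look like the wrapper) is my assumption, not necessarily what oclHashcat does, and both function names are hypothetical.

```python
HEX_PREFIX = "$HEX["


def decode_plain(field: str) -> bytes:
    """Turn an output/dictionary field back into the raw password bytes."""
    if field.startswith(HEX_PREFIX) and field.endswith("]"):
        return bytes.fromhex(field[len(HEX_PREFIX):-1])
    return field.encode("latin-1")


def encode_plain(data: bytes) -> str:
    """Write plain text when it is safe, $HEX[...] otherwise."""
    printable = all(0x20 <= b <= 0x7E for b in data)
    looks_like_wrapper = data.startswith(HEX_PREFIX.encode("ascii"))
    if printable and not looks_like_wrapper:
        return data.decode("ascii")
    return HEX_PREFIX + data.hex() + "]"


print(encode_plain(b"1"))           # stays as plain text
print(encode_plain(b"1\x002"))      # embedded NUL forces hex
print(decode_plain("$HEX[310032]")) # back to the three raw bytes
```

Encoding and then decoding returns the original bytes, which is what makes the output file reusable as a dictionary.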
This is even more significant with passwords ending in a CR/CRLF.
The ISW 2012 cracking challenge, for example, had more than 264,000 passwords which required hex encoding.