Stats Processor Spitting Out Junk
#1
Hello,

I have an issue. I am using the latest hashcat utils and stats processor. I have an AMD Radeon 5650 with the latest APP SDK; installing it fixed my initial problem, where nothing appeared on the status screen. I'm using Windows 7 64-bit. The issue I am having now is with the stats processor. I used the example Atom showed for statsgen.exe and the stats processor with rockyou.txt. My stats file size comes out correct.

The Example: http://hashcat.net/forum/thread-1265.html

When I enter the following command...
\statsprocessor.exe --pw-min 5 --pw-max --threshold 400000 rockyou.hcstat | C:\HashcatGUI_042\hashcat-utils-1.0\head -20
...this is what I get:
0arin
0onan
0enan
0inan
0unan
0rina
0love
08712
0hana
01012
02341
09120
00012
05012
04120
03012
0nger
0mana
06012
07120

Here is my output with the same example, but with this parameter added after the .hcstat file: ?l?l?l?l?l?l?l?l (still not words, just junk).

marin
monan
menan
minan
munan
mrina
mlove
mhana
mnger
mmana
mshan
mylle
mtana
mchan
mdana
mbana
mpana
mkina
mjana
mweer

No matter what I do, the stats processor seems to output complete randomness, with no words that make sense (I thought real words were part of the whole point of the stats processor). The threshold seems to have absolutely no impact, and if I just let it run through all the iterations it seems to do a complete brute force within my specified --pw-min/--pw-max. I seem to be unable to make use of the markov chain depth. When I use the parameters -1 ?l?d ?1?1?1?1?1 I do get only letters and digits, but I still can't apply a threshold limit, I still get complete junk even with my rockyou.hcstat file, and the iterations proceed like brute force, forever and ever.

Has anyone had this problem before? I am thoroughly confused and wondering whether something is wrong with my .hcstat files or whether the stats processor is not working right on my system for some reason. Nothing from my stats processor attempts is usable for any intelligent attack. It's not working; it just seems to be brute-forcing.

Here is another example, from the website. My results were the same as shown there, but I still don't understand why I received the results above. Am I missing something?

The example given by: http://hashcat.net/wiki/doku.php?id=statsprocessor
My command and results:

\HashcatGUI_042\hashcat-utils-1.0\statsprocessor-0.083>sp64.exe --pw-min 5 --pw-max 5 hashcat.hcstat ?l?l?l?l?l | C:\HashcatGUI_042\hashcat-utils-1.0\head -9
sange
songe
serin
singe
sunge
srane
shane
slane
snder

Any help would be much appreciated. I am just trying to output something productive. Thanks, everyone :)

Update: I realized that the threshold only has no impact when it is above the number of characters I am using. I saw a major difference when using --threshold 13 with the character set -1 ?l: the progression was now different and I could see the threshold working. Can anyone elaborate, so I understand this better? My idea now is that the threshold (the number of characters added to the markov table) has to be within the size of the character set. Is this correct? I am also wondering what I can do to produce results that are more realistic instead of random. This didn't work too badly, but could I be doing something better? Thanks

~Regards~

dayvjohnson
#2
statsprocessor is built into oclHashcat and used by default anywhere a mask is used, so the only reason to use statsprocessor on its own is to add oclHashcat-style per-position markov support to other applications that do not have it (CPU hashcat, John the Ripper, etc.). I mention this to make sure you are not piping statsprocessor into oclHashcat, since you mentioned your GPU and the APP SDK.

oclHashcat's markov implementation is per-position and based around mask attacks. Therefore the threshold needs to be less than the number of possible characters at each position. For example, ?l only has 26 possible characters, so -t needs to be < 26 to have any effect.

Without -t, or with a -t value greater than the number of possible characters in each mask position, oclHashcat will simply do a probabilistically ordered brute force. So you'll still be brute forcing the same keyspace, just in a different order.
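
To put rough numbers on that, here is a minimal Python sketch. It is not part of statsprocessor or oclHashcat; the function name `keyspace` is made up for illustration, and it assumes the threshold simply caps how many characters are tried at each mask position, as described above.

```python
def keyspace(charset_sizes, threshold=None):
    """Count candidates for a mask, given the charset size at each position.

    Assumption for illustration: a threshold caps the number of characters
    tried at each position; no threshold means the full charset is used.
    """
    total = 1
    for size in charset_sizes:
        kept = size if threshold is None else min(size, threshold)
        total *= kept
    return total


mask = [26] * 8  # ?l?l?l?l?l?l?l?l

print(keyspace(mask))          # 208827064576 (26**8, full brute force)
print(keyspace(mask, 16))      # 4294967296   (16**8, threshold takes effect)
print(keyspace(mask, 400000))  # 208827064576 (threshold above 26 changes nothing)
```

Under that assumption, a threshold of 400000 on a ?l mask leaves the keyspace untouched, while a threshold of 16 cuts it by a factor of about 49.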

In either case the candidates aren't random, even if they appear to be nonsensical. They are probabilistically ordered based on the character position, adjacent character, and length. The reason you may not be seeing very many real words is that you are currently using a length of 5, and there probably aren't that many 5-character passwords in your training set.
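
As a toy illustration of "ordered by position and adjacent character", the Python sketch below trains counts on a tiny word list and then enumerates candidates with the most likely characters first. It is a simplified stand-in, not statsprocessor's actual algorithm or .hcstat format: the names `train`, `ranked`, and `candidates` are hypothetical, it ignores length statistics, and the enumeration order is a plain depth-first walk.

```python
from collections import Counter, defaultdict


def train(words):
    """Count first characters, and next characters keyed by (position, previous char)."""
    root = Counter()
    chain = defaultdict(Counter)
    for word in words:
        if not word:
            continue
        root[word[0]] += 1
        for i in range(1, len(word)):
            chain[(i, word[i - 1])][word[i]] += 1
    return root, chain


def ranked(counter, charset, threshold):
    """Most frequent characters first; ties and unseen characters keep charset order."""
    return sorted(charset, key=lambda c: -counter[c])[:threshold]


def candidates(root, chain, charset, length, threshold):
    """Yield fixed-length candidates, trying likely characters first at each position."""
    def extend(prefix):
        if len(prefix) == length:
            yield prefix
            return
        pos = len(prefix)
        counter = root if pos == 0 else chain[(pos, prefix[-1])]
        for c in ranked(counter, charset, threshold):
            yield from extend(prefix + c)
    yield from extend("")


charset = "abcdefghijklmnopqrstuvwxyz"
root, chain = train(["love", "lover", "live", "lane"])
for word in candidates(root, chain, charset, length=4, threshold=2):
    print(word)
```

With a threshold of 2 this prints 16 candidates, starting with "love" and then strings like "lova" that merely follow likely transitions. That is the same effect as in the output above: plausible-looking fragments, not necessarily dictionary words.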
#3
Wow, thank you very much, I really understood all of that. I really appreciate having a solid, thorough answer. Your speculation was right, by the way: that is exactly what I was doing, piping the stats processor into oclHashcat. I changed my length and am getting real words now. Right now I am not piping, just viewing the different combinations after entering this command: \sp64.exe --pw-min 8 --pw-max 8 --threshold 16 rockyou.hcstat ?l?l?l?l?l?l?l?l. I am seeing a huge difference and am quite impressed.

I have another question now. Am I correct in thinking that I can use the stats processor to produce a statistically better dictionary, with more likely password combinations, and then clean up the dictionary to remove confusion or dupes for a statistical attack, as mentioned here?

https://hashcat.net/forum/thread-1285.html

I was piping, like you mentioned, in an attempt to use a more effective dictionary for attacks. Now I know more about how markov chains are used in various implementations, thanks for that. So am I thinking about this in the right way now, if I follow the process in the thread above and learn from it? Again, I appreciate the help; this is incredible to me, I'm hooked.

PS: I was also piping because, besides wanting a better dictionary, I was afraid of a monstrous output of possibilities creating a ginormous dictionary. Do you have any words of wisdom on this subject? I want to create better dictionaries that aren't junk. E.g. I have a dictionary that is about 93 gigs but useless to me in its current condition (e.g. johnny,appleseedWAYtoL8ng...etc...). How can I be aware of and control the chance of outrageous output here?

Regards,

~dayvjohnson~
#4
Since you have to have good wordlists to train statsprocessor, it doesn't make much sense to use statsprocessor to build a wordlist. As stated above, its primary purpose is to extend oclHashcat-style per-position markov attacks to other programs. So while it doesn't make sense to pipe statsprocessor into oclHashcat, it does make sense to e.g. pipe it into John the Ripper.

Nothing makes better wordlists than real world passwords, so the best way to create better wordlists is to simply crack more passwords.
#5
Thanks :) I understand.