Cracking speed for WPA/WPA2 hashes
For WPA/WPA2 hash cracking, there are three major types of attack: dictionary, rule-based, and mask.

I was testing which attack is fastest, and I found that the dictionary attack is slower than the other two. The rule-based and mask attacks gave me nearly the same speed.

Is it true that the mask attack is faster than the dictionary attack?

1- My understanding of the mask attack is that the passwords are generated by the CPU, then copied over PCI Express to the GPU, and then the GPU starts working on cracking the hash. Is this correct? Below is an example of the command I use to perform a mask attack:

hashcat -m 2500 hash.hccapx -a 3 ?d?d?d?d?d?d?d?d -w 3

If this is correct, why is the mask attack much faster than the dictionary attack, although both copy the data from the CPU to the GPU over PCI Express? In my testing the mask attack was always faster than the dictionary attack.

2- I thought the fastest attack would be the rule-based one, because it modifies the wordlist inside the GPU, so no further copying from CPU to GPU is needed. To my surprise, the mask attack is sometimes faster than the rule-based attack, although the mask attack should be slower if it has to copy the generated words from CPU to GPU, which takes a lot of time. I expected the rule-based attack to be faster because all the work of turning each word into many other words is done inside the GPU.

3- Does the wordlist size affect the GPU cracking speed for a dictionary attack, a rule-based attack, or a mask attack? I think that the bigger the wordlist, the faster the rule-based attack will run on the GPU. Is this correct? I also noticed that the cracking speed decreases with a bigger wordlist in a dictionary attack, which is the opposite of the rule-based attack. And for the mask attack, does a longer mask make a difference in speed?

4- If we sort the methods for cracking WPA from fastest to slowest, would it be:

1- Rule-based attack
2- Mask attack
3- Dictionary attack?

Thank you.

Michael Robert
There are two ways to measure cracking speed: hashes tried per second (H/s), and passwords cracked per second (Pwd/s). Personally, I think the second is more important.

In an ideal situation, you have a dictionary that has every password you are trying to crack in it and nothing extra. This is the fastest Pwd/s and the slowest H/s. So IMHO, the dictionary attack in an ideal situation is the fastest.

Of course, you won't have an ideal situation in real life. Because, as you said, getting the dictionary words to the GPU is a bottleneck, you can use rules to have the otherwise [mostly] idle GPU create some variants of each dictionary word. This will greatly increase the H/s speed, and probably increase the Pwd/s speed beyond what a real-life dictionary attack would give. Obviously, not all rules are going to be equally effective, and picking the set that gives you the best result is an art mixed with some luck.

Masks are useful, and for me, give the fastest H/s speed. But their Pwd/s speed seems slower than rules. They are useful because not all passwords are based on a [mangled] dictionary word. There are passwords you'll never find with a dictionary/rule attack. There are also passwords for which a mask attack would likely be more effective than a dictionary/rule attack, such as a password that's just numbers. (Dictionaries can have numbers as words.)

The trickiest part is getting good masks. PACK has a nice mask generator, but it is based on what you've already cracked by other means, including other mask attacks. However, to be effective, PACK requires that your cracked passwords be representative of all the passwords, so feeding it passwords cracked solely by masks isn't likely to give good results.

Finally, anything you know about the password set can be used to increase the Pwd/s. For example, if the passwords are from a site/company that requires one capital letter and one lowercase letter, and the password must be 8-12 characters, you don't need to try the mask ?l?l?l?l?l?l?l. Or if you know the person who created the password doesn't believe 0 is a number, that can be used in a custom character set. (I'm not kidding about the 0.)
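
To put rough numbers on that last point, here is a small Python sketch (not part of hashcat) that counts how many candidates a mask produces. The charset sizes are the ones hashcat documents for its built-in placeholders; the function name `mask_keyspace`, the `?1` custom charset without 0, and the speed of 100,000 H/s are my own assumptions for illustration only.

```python
# Sizes of hashcat's built-in mask charsets (?l, ?u, ?d, ?s, ?a).
BUILTIN_SIZES = {"?l": 26, "?u": 26, "?d": 10, "?s": 33, "?a": 95}

def mask_keyspace(mask, custom=None):
    """Count the candidates a mask would generate.

    `custom` maps placeholders such as "?1" to a string of allowed characters.
    """
    sizes = dict(BUILTIN_SIZES)
    for token, chars in (custom or {}).items():
        sizes[token] = len(set(chars))
    total = 1
    i = 0
    while i < len(mask):
        token = mask[i:i + 2]
        if token == "??":        # escaped literal '?', contributes a factor of 1
            i += 2
        elif mask[i] == "?":     # charset placeholder
            total *= sizes[token]
            i += 2
        else:                    # any other literal character
            i += 1
    return total

ASSUMED_SPEED = 100_000  # H/s, a hypothetical figure, not a benchmark

all_digits = mask_keyspace("?d" * 8)
no_zero = mask_keyspace("?1" * 8, custom={"?1": "123456789"})

for label, keyspace in [("8 digits", all_digits), ("8 digits, no 0", no_zero)]:
    print(f"{label}: {keyspace} candidates, "
          f"about {keyspace / ASSUMED_SPEED:.0f} s at the assumed speed")
```

Dropping a single character from the charset cuts the work by more than half here, which is why knowledge about the password set helps Pwd/s even when H/s stays the same.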
Thank you, but your post doesn't answer my questions.

ATOM, can you please help? A reply to my post would be greatly appreciated.
The short answer is: I/O
You shouldn't underestimate how slow it is to read from disk compared to e.g. generating the password candidates with rule/mask engine directly on GPU (fast hashes).
For slow hashes like WPA the difference might not be that large (because the time taken by the algorithm itself is more significant), but the time needed to read from disk will still influence the H/s.
It also goes without saying that when you run a mask, in general only the byte at one position changes from one password candidate to the next, whereas in a dictionary attack the whole password candidate (line) changes.
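
A quick way to convince yourself of the "only one position changes" point is the Python sketch below. It is only an illustration: it enumerates a 4-digit mask with `itertools.product`, which varies the rightmost position fastest, and that ordering is an assumption of this sketch (hashcat's internal ordering may differ). The helper names are hypothetical.

```python
from itertools import product

def mask_candidates(charsets):
    """Yield every candidate for a mask given one charset string per position."""
    for combo in product(*charsets):
        yield "".join(combo)

def changed_positions(a, b):
    """Count the positions at which two equal-length candidates differ."""
    return sum(x != y for x, y in zip(a, b))

digits = "0123456789"
cands = list(mask_candidates([digits] * 4))   # equivalent of mask ?d?d?d?d

# Compare each candidate with its successor.
diffs = [changed_positions(a, b) for a, b in zip(cands, cands[1:])]
single = sum(d == 1 for d in diffs)

print(f"{single} of {len(diffs)} steps change exactly one position")
```

A wordlist has no such structure: the next line can differ in every byte, so each candidate has to be read and transferred in full.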

If the speed of a dictionary attack is slower than expected, you should consider adding some rules to it, so that the GPU has enough to do (you should keep it busy) while the dictionary is read from disk and/or the password candidates are transferred over PCI.

Even with the fastest SSD (or even a ramdisk), the I/O will still take longer than generating the password candidates on the fly (in registers/RAM). I/O is slow on its own, and all the additional instructions involved with it (yes, even with a ramdisk) don't make it any faster.
(see for instance the comparisons between register, ram, ssd, hdd here: https://gist.github.com/jboner/2841832#f...cy-txt-L15 )

BTW: what rsberzerker said is actually not wrong at all. Sometimes it makes sense to think not about raw speed but about how efficient the attack is. Even if an attack tests fewer hashes per second, it could be much more efficient ("faster") because it is much more targeted/specific. If you run a very specific dictionary (with some well-working rules) against a specific hash list, it might be much smarter than running a full brute-force (even if the latter is "much faster" if we look at the H/s).
You shouldn't underestimate how slow it is to read from disk compared to e.g. generating the password candidates with rule/mask engine directly on GPU (fast hashes).

You mentioned that with the mask engine the password generation is done by the GPU and not by the CPU, but the hashcat site https://hashcat.net/wiki/doku.php?id=bru...s_original clearly says that for a mask attack the passwords are generated by the CPU and then copied to the GPU over PCI. Could you please correct me if I am wrong?

In a mask attack, where are the passwords generated: on the CPU or on the GPU?

Second question: which is faster?
Example1: hashcat -m 2500 hash.hccapx -a 3 ?d?d?d?d?d?d?d?d -w 3
Example2: mp64 ?d?d?d?d?d?d?d?d | hashcat -m 2500 bla.hccap

Sometimes the mask attack is run by hashcat itself, as in Example1, and sometimes it is run through piping, as in Example2. Which is better and faster?

Thank you.
The wiki article you mention is about maskprocessor, not hashcat. If you use mp the candidates are generated on cpu.

Using hashcat's a3 mode is always better than using mp if you don't need any special features of mp.
(09-02-2017, 08:59 PM)undeath Wrote: The wiki article you mention is about maskprocessor, not hashcat. If you use mp the candidates are generated on cpu.

Using hashcat's a3 mode is always better than using mp if you don't need any special features of mp.

OK, so I understand from your answer that the hashcat mask engine ("hashcat -a3") uses the GPU to generate the passwords, while mp64 or princeprocessor use the CPU and not the GPU. Please confirm my understanding. (I did not find anything on the hashcat wiki saying that hashcat's built-in mask attack uses the GPU.)

If yes, which is considered faster in terms of H/s: the mask attack or the rule attack? Both are processed by the GPU.

Thank you.
Yes, if you pipe anything into hashcat (mp/pp/…) it's using your cpu.

In general, a mask attack will be the fastest mode, but with sufficient rules the speed of a wordlist attack should be pretty close.
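
To illustrate why enough rules close the gap, here is a toy Python sketch of the amplification effect: every word that has to be read and transferred is expanded into several candidates. It mimics only four rule functions (`:`, `c`, `u`, `$X`) and runs on the CPU, so it shows the idea and not how hashcat's rule engine is actually implemented; the function name `apply_rule` and the word and rule lists are made up for the example.

```python
def apply_rule(rule, word):
    """Apply one rule from a tiny subset of hashcat's rule syntax."""
    if rule == ":":                              # no-op, keep the word as is
        return word
    if rule == "c":                              # capitalize first letter, lowercase the rest
        return word.capitalize()
    if rule == "u":                              # uppercase everything
        return word.upper()
    if len(rule) == 2 and rule[0] == "$":        # $X appends the character X
        return word + rule[1]
    raise ValueError(f"rule not covered by this sketch: {rule!r}")

words = ["password", "letmein"]     # what is read from disk and transferred
rules = [":", "c", "u", "$1"]       # what multiplies the work per word

candidates = [apply_rule(r, w) for w in words for r in rules]

print(f"{len(words)} words transferred, {len(candidates)} candidates hashed")
print(candidates)
```

With thousands of rules instead of four, the per-word transfer cost is spread over thousands of hash computations, which is what keeps the GPU busy.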
As already mentioned above, the hash algorithm itself determines where and how the password candidates are generated for mask attacks (-a 3).
The hashcat source code distinguishes between ATTACK_EXEC_INSIDE_KERNEL and ATTACK_EXEC_OUTSIDE_KERNEL.
You can also just look at the OpenCL/ folder and see which hash types use a *_a3.cl kernel and which do not (the kernel file name is m[hash_type]_a3.cl in the case of ATTACK_EXEC_INSIDE_KERNEL, and m[hash_type].cl in the case of ATTACK_EXEC_OUTSIDE_KERNEL).

There are two main reasons for this difference. First, as already mentioned above, some hashing algorithms are themselves much more demanding (slower, with more instructions and iterations), so the core computation is the main bottleneck rather than password candidate generation, reading, or transfer. Second, hashcat needs to keep the runtime of a single kernel run low to avoid kernel timeouts. Default operating system and driver settings set a very low timeout for kernel runs, and if this time is exceeded you will get a warning that your system is unresponsive (and of course it could really become unresponsive with high workload settings). Furthermore, the user also wants the status display to update regularly and frequently so that she can see the progress.
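
The idea of keeping each kernel run short can be sketched in Python as follows. This is a conceptual illustration under stated assumptions, not hashcat code: repeated SHA-256 stands in for the real WPA key derivation (PBKDF2-HMAC-SHA1 with 4096 iterations), each call to `iterate_chunk` plays the role of one short kernel launch, and the chunk size of 1024 is a hypothetical value.

```python
import hashlib

def iterate_chunk(state, rounds):
    """Run `rounds` hash iterations starting from `state` (one short 'launch')."""
    for _ in range(rounds):
        state = hashlib.sha256(state).digest()
    return state

def run_in_chunks(seed, total_rounds, chunk_size):
    """Do the same work as one long run, split into several short launches."""
    state = seed
    launches = 0
    remaining = total_rounds
    while remaining > 0:
        step = min(chunk_size, remaining)
        state = iterate_chunk(state, step)   # intermediate state is carried over
        remaining -= step
        launches += 1                        # a status update could happen here
    return state, launches

seed = b"candidate-password"
one_shot = iterate_chunk(seed, 4096)
chunked, launches = run_in_chunks(seed, 4096, 1024)

print(f"same result: {chunked == one_shot}, launches: {launches}")
```

The result is identical either way; the split only bounds how long any single run occupies the device, which is what avoids the timeout and lets the status display refresh between runs.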