The short answer is: I/O.
You shouldn't underestimate how slow reading from disk is compared to, e.g., generating the password candidates with the rule/mask engine directly on the GPU (which is what happens for fast hashes).
For slow hashes like WPA the difference might be smaller (because the time the algorithm itself takes is a larger share of the total), but the time needed to read from disk will still influence the H/s.
It should also go without saying that if you run a mask, in general only the byte at one position changes between consecutive password candidates, while in a dictionary attack the whole password candidate/line changes.
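To make the mask vs. dictionary point a bit more concrete, here is a minimal Python sketch. It is only an illustration: the enumeration order and the helper names (mask_candidates, changed_positions, single_byte_transitions) are my own assumptions, not how hashcat's mask engine is actually implemented. It counts how many consecutive candidates differ in exactly one position.

```python
from itertools import product


def mask_candidates(charsets):
    """Yield every candidate for a mask, one charset per position.

    itertools.product varies the last position fastest; this order is an
    assumption made for illustration only.
    """
    for combo in product(*charsets):
        yield "".join(combo)


def changed_positions(a, b):
    """Count positions that differ between two candidates (length differences count too)."""
    return sum(1 for x, y in zip(a, b) if x != y) + abs(len(a) - len(b))


def single_byte_transitions(candidates):
    """Return (transitions that change exactly one position, total transitions)."""
    single = total = 0
    prev = None
    for cand in candidates:
        if prev is not None:
            total += 1
            if changed_positions(prev, cand) == 1:
                single += 1
        prev = cand
    return single, total


if __name__ == "__main__":
    # Toy mask: three positions, each drawn from "abc".
    print(single_byte_transitions(mask_candidates(["abc"] * 3)))
    # Two unrelated dictionary words differ in (almost) every position.
    print(changed_positions("password", "letmein1"))
```

With a mask, most steps only touch one position, so the next candidate can be derived cheaply from the previous one. Unrelated dictionary lines have to be supplied in full every time.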
If the speed of a dictionary attack is slower than expected, you should consider adding some rules to it, so that the GPU has enough to do (you should keep it busy) while the dictionary is read from disk and/or the password candidates are transferred over PCI.
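Why rules help can be sketched like this. The snippet is only a rough Python illustration under my own assumptions: the RULES list and the expand helper are hypothetical and are not hashcat's rule syntax or engine. The idea is simply that every word that had to be read and transferred turns into several candidates, so the expensive per-word I/O is spread over more work.

```python
# Hypothetical rule set for illustration only (not hashcat rule syntax).
RULES = [
    lambda w: w,               # keep the word unchanged
    lambda w: w.upper(),       # uppercase everything
    lambda w: w.capitalize(),  # capitalize the first letter
    lambda w: w + "1",         # append a digit
    lambda w: w[::-1],         # reverse the word
]


def expand(words, rules):
    """Yield len(rules) candidates for every word that was read once."""
    for word in words:
        for rule in rules:
            yield rule(word)


if __name__ == "__main__":
    words = ["pass", "dragon"]  # stands in for lines read from a wordlist
    candidates = list(expand(words, RULES))
    print(len(words), "words read ->", len(candidates), "candidates to hash")
```

In this sketch the amplification factor equals the number of rules: the same amount of reading yields five times as many candidates to test.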
Even with the fastest SSD (or even a ramdisk) the I/O will still take longer than generating the password candidates on-the-fly (in registers/RAM). I/O is slow on its own, and all the additional instructions involved with it (yes, even with a ramdisk) don't make it any faster.
(see for instance the comparisons between register, ram, ssd, hdd here: https://gist.github.com/jboner/2841832#f...cy-txt-L15 )
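For a rough feeling of the orders of magnitude, here is a small Python sketch. The numbers are the widely circulated "latency numbers every programmer should know" figures as I remember them. They are hard-coded assumptions, dated, and not measurements of any current hardware, so treat them as ballpark values only. The slowdown helper is a hypothetical name of mine.

```python
# Approximate latencies in nanoseconds (assumed ballpark figures, not measurements).
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "main memory reference": 100,
    "read 1 MB sequentially from memory": 250_000,
    "read 1 MB sequentially from SSD": 1_000_000,
    "disk seek": 10_000_000,
    "read 1 MB sequentially from disk": 20_000_000,
}


def slowdown(slow, fast):
    """Return how many times slower the 'slow' operation is than the 'fast' one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


if __name__ == "__main__":
    base = "read 1 MB sequentially from memory"
    for name in ("read 1 MB sequentially from SSD",
                 "read 1 MB sequentially from disk"):
        print(f"{name}: {slowdown(name, base):.0f}x slower than memory")
```

Even if modern drives do better than these old figures, the gap to data that is already sitting in registers or RAM stays large.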
BTW: what rsberzerker said is actually not wrong at all. Sometimes it makes sense to think not about raw speed but about how efficient the attack is. Even if the attack tests fewer hashes per second, it could be much more efficient/"faster", because it is much more targeted/specific. If you run a very specific dictionary (with some well-working rules) against a specific hash list, it might be much cleverer than running a full brute-force (even if the latter is "much faster" if we look at the H/s).