11-09-2018, 08:30 PM
(11-09-2018, 07:57 PM)atom Wrote: Therefore I'll mark the seek function as optional. In case of PCFG, I'll simply call the sc_pcfg_next() that many times I need to seek, simply because there's no way around this "feature" in a distributed environment (not even in just a multi-gpu environment).
This will create a bottleneck but I don't see a different solution to this.
I apologize, since I don't think I was clear:
1) Just like there are many types of Markov attacks (Hashcat Markov, JtR/Narayanan Markov, JtR Incremental, OMEN, etc.), there are many ways to generate guesses with PCFGs. These different ways have pluses and minuses.
2) There is no single "best" way to generate guesses with PCFGs.
3) In my previous guess generators, I prioritized probability order since I was approaching it mostly from a research/academic standpoint. I wanted to have the most "accurate" password guess generator, at the expense of many other things such as performance. With this approach there is no good index function.
4) If you don't care about probability order, there are other methods that have indexing functions of various speeds. Heck, I wrote a proof of concept PCFG rainbow table back in the day when rainbow tables were still relevant. What this means is that a PCFG-based approach could support sc_pcfg_seek.
5) I need time to think about the best way to go about designing a version of PCFG that supports indexing and sc_pcfg_seek. It's not that it can't be done. It's more like there are a lot of ways it *can* be done and I'd like to weigh the trade-offs between them. I'm sure I'll be asking for feedback/suggestions/help/advice on this as well since, as I said, I mostly come from an academic background, so I value input from practitioners and the wider pw community.
6) I'd also like a new name for this approach so that it doesn't get confused with the probability order version.
7) If there is yet another version of an interface of hashcat's slow guess framework that doesn't require a "seek" function, I think that would be valuable for future work. My gut feeling, though, is that the Hashcat user-base would be more interested in an indexed version that has better support for restarting sessions, is faster with less memory usage, etc., even though it is significantly less accurate/precise. What that means is I'll probably focus on providing you a version of a PCFG guess generator that supports the five functions you requested.
8) Once again to set expectations, I probably will not start work on coding this until January. Talking about it is a lot easier than doing it ;p
9) To further set expectations, I'm a slow coder who writes slow code, as can be seen by the current PCFG development history.
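To make point 4 a bit more concrete, here is a minimal Python sketch of one way an index-friendly generator could work once probability order is given up. It assumes a single base structure is just a list of "slots", each holding candidate strings, and treats a guess number as a mixed-radix number over those slots. The class and method names are hypothetical and are not hashcat's interface or the existing PCFG code.

```python
from math import prod


class IndexedStructure:
    """One base structure, e.g. word + digits + symbol (hypothetical layout)."""

    def __init__(self, slots):
        # slots: one list of candidate strings per position in the guess
        self.slots = slots
        self.keyspace = prod(len(slot) for slot in slots)
        self.position = 0

    def guess_at(self, index):
        # Decode index as a mixed-radix number; the last slot varies fastest.
        if not 0 <= index < self.keyspace:
            raise IndexError(index)
        parts = []
        for slot in reversed(self.slots):
            index, digit = divmod(index, len(slot))
            parts.append(slot[digit])
        return "".join(reversed(parts))

    def seek(self, index):
        # Constant time: no earlier guesses have to be replayed.
        self.position = index

    def next(self):
        # Return the guess at the current position, or None when exhausted.
        if self.position >= self.keyspace:
            return None
        guess = self.guess_at(self.position)
        self.position += 1
        return guess
```

With something like this, each worker in a distributed setup can seek straight to its own offset instead of calling next() that many times. The cost is that guesses come out in slot order rather than probability order.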
On a side note:
Quote:First thing that pops into my mind is it would be ideal to allow multiple of such "masks" to be executed sequentially. And then of course some sort of reader/trainer to create such masks and eventually order them by probability.
That feature actually existed in version 2 of the PCFG guess generator, and is also roughly how the CMU version of the PCFG cracker works. I never ported it to version 3, but that approach is certainly viable. There are other optimizations you can apply to that approach as well, such as increasing "smoothing" of the training data so that each base structure/mask would generate more guesses, thus reducing the on-disk space required. As I said earlier, there are lots of ways to go about this and each way has its own set of trade-offs that need to be balanced.
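As a rough illustration of the "multiple masks executed sequentially" idea, the Python sketch below orders base structures by probability and maps one global guess index onto them through cumulative keyspace sizes. It is only a sketch under assumptions: the (probability, slots) layout and all function names are made up for this example and do not reflect how version 2 or the CMU cracker actually store their data.

```python
from bisect import bisect_right
from itertools import accumulate
from math import prod


def decode(slots, index):
    # Mixed-radix decode within one structure; the last slot varies fastest.
    parts = []
    for slot in reversed(slots):
        index, digit = divmod(index, len(slot))
        parts.append(slot[digit])
    return "".join(reversed(parts))


def build_plan(structures):
    # structures: list of (probability, slots); run the most probable first.
    ordered = sorted(structures, key=lambda s: s[0], reverse=True)
    sizes = [prod(len(slot) for slot in slots) for _, slots in ordered]
    ends = list(accumulate(sizes))  # ends[i] = total guesses in structures 0..i
    return ordered, ends


def guess_at(plan, index):
    # Find which structure the global index falls into, then decode locally.
    ordered, ends = plan
    if not ends or not 0 <= index < ends[-1]:
        raise IndexError(index)
    which = bisect_right(ends, index)
    start = ends[which - 1] if which else 0
    return decode(ordered[which][1], index - start)
```

Only the order of the structures follows probability here; inside a structure the guesses are in plain index order. Heavier smoothing would make each structure's keyspace larger, so fewer structures need to be stored for the same number of guesses.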