combine rules without duplicates?
#1
Questions: 
Is there a way to combine rules without getting an extremely large number of duplicates?


For example, take leetspeak.rule, which contains:
sa4
sa@
sb6
sc<
sc{
se3
sg9
si1
si!
so0
sq9
ss5
ss$
st7
st+
sx%


What I'm actually looking for is the combination of these rules, which resolves to the 32 viable combinations below. I created them by hand because I couldn't come up with an automatic way to do it. How can I create this list automatically, without duplicates?
sa4 sb6 sc< se3 sg9 si! so0 sq9 ss$ st+ sx%
sa4 sb6 sc< se3 sg9 si! so0 sq9 ss$ st7 sx%
sa4 sb6 sc< se3 sg9 si! so0 sq9 ss5 st+ sx%
sa4 sb6 sc< se3 sg9 si! so0 sq9 ss5 st7 sx%
sa4 sb6 sc< se3 sg9 si1 so0 sq9 ss$ st+ sx%
sa4 sb6 sc< se3 sg9 si1 so0 sq9 ss$ st7 sx%
sa4 sb6 sc< se3 sg9 si1 so0 sq9 ss5 st+ sx%
sa4 sb6 sc< se3 sg9 si1 so0 sq9 ss5 st7 sx%
sa4 sb6 sc{ se3 sg9 si! so0 sq9 ss$ st+ sx%
sa4 sb6 sc{ se3 sg9 si! so0 sq9 ss$ st7 sx%
sa4 sb6 sc{ se3 sg9 si! so0 sq9 ss5 st+ sx%
sa4 sb6 sc{ se3 sg9 si! so0 sq9 ss5 st7 sx%
sa4 sb6 sc{ se3 sg9 si1 so0 sq9 ss$ st+ sx%
sa4 sb6 sc{ se3 sg9 si1 so0 sq9 ss$ st7 sx%
sa4 sb6 sc{ se3 sg9 si1 so0 sq9 ss5 st+ sx%
sa4 sb6 sc{ se3 sg9 si1 so0 sq9 ss5 st7 sx%
sa@ sb6 sc< se3 sg9 si! so0 sq9 ss$ st+ sx%
sa@ sb6 sc< se3 sg9 si! so0 sq9 ss$ st7 sx%
sa@ sb6 sc< se3 sg9 si! so0 sq9 ss5 st+ sx%
sa@ sb6 sc< se3 sg9 si! so0 sq9 ss5 st7 sx%
sa@ sb6 sc< se3 sg9 si1 so0 sq9 ss$ st+ sx%
sa@ sb6 sc< se3 sg9 si1 so0 sq9 ss$ st7 sx%
sa@ sb6 sc< se3 sg9 si1 so0 sq9 ss5 st+ sx%
sa@ sb6 sc< se3 sg9 si1 so0 sq9 ss5 st7 sx%
sa@ sb6 sc{ se3 sg9 si! so0 sq9 ss$ st+ sx%
sa@ sb6 sc{ se3 sg9 si! so0 sq9 ss$ st7 sx%
sa@ sb6 sc{ se3 sg9 si! so0 sq9 ss5 st+ sx%
sa@ sb6 sc{ se3 sg9 si! so0 sq9 ss5 st7 sx%
sa@ sb6 sc{ se3 sg9 si1 so0 sq9 ss$ st+ sx%
sa@ sb6 sc{ se3 sg9 si1 so0 sq9 ss$ st7 sx%
sa@ sb6 sc{ se3 sg9 si1 so0 sq9 ss5 st+ sx%
sa@ sb6 sc{ se3 sg9 si1 so0 sq9 ss5 st7 sx%


If I run the command to join rules together, I get an uncontrollable number of duplicates, because it joins every rule to every rule.
hashcat -r leetspeak.rule -r leetspeak.rule -r leetspeak.rule -r leetspeak.rule --stdout wordlist

So if I take my list of '32 leetspeak rules' and add a "Capitalize the first letter and lower the rest" rule, that should give 64 total rules, not the crazy number of results I actually get.

What am I doing wrong?
#2
You could use mp64 to generate them, maybe?

https://hashcat.net/wiki/doku.php?id=rul...kprocessor

Might still have to dedupe it a little after, depending
#3
(08-31-2020, 11:53 PM)royce Wrote: You could use mp64 to generate them, maybe?

https://hashcat.net/wiki/doku.php?id=rul...kprocessor

Might still have to dedupe it a little after, depending

But even with this, how can I dedupe them? The best I've been able to do so far is check them manually, line by line.
#4
Dedupe of text on the command line is a largely solved problem. Depends on your platform. 'sort -u' on Unix-likes covers most use cases. On Windows, 'sort.exe /unique' seems roughly equivalent.
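Where neither sort variant is available, the same exact-match dedupe can be sketched in a few lines of Python. This is only an illustration: `dedupe_lines` is a name made up for this example, and it removes lines that are textually identical, nothing more.

```python
def dedupe_lines(lines):
    """Drop exact duplicate lines, keeping the first occurrence of each."""
    # dict keys keep insertion order, so the original ordering survives
    return list(dict.fromkeys(line.rstrip("\n") for line in lines))


if __name__ == "__main__":
    rules = ["sa4 sb6", "sa@ sb6", "sa4 sb6", "sb6 sa4"]
    for rule in dedupe_lines(rules):
        print(rule)
```

Unlike sorting, this keeps the rules in their original order. "sa4 sb6" and "sb6 sa4" both survive, because they are different strings.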
#5
@royce: I assume the bigger problem is with rules that are equivalent, yet do not have the same string representation, e.g. "sa4 sb6" and "sb6 sa4".
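A rough sketch of how such order-only duplicates could be caught, under a strong assumption: every function on the line is a plain substitution (sXY), each source character appears at most once per line, and no replacement character is itself a source character, so the functions can be reordered freely. That holds for the leetspeak list in this thread but not for hashcat rules in general. The function names are made up for this example.

```python
def canonical(rule_line):
    """Sort the functions of a rule line so order-only variants compare equal.

    ASSUMPTION: the line holds only independent 'sXY' substitutions separated
    by spaces (one per source character, no replacement character that is
    also a source character). Arbitrary hashcat rules do NOT commute.
    """
    return " ".join(sorted(rule_line.split()))


def dedupe_equivalent(rule_lines):
    """Keep the first rule line of every group that shares a canonical form."""
    seen = set()
    unique = []
    for line in rule_lines:
        key = canonical(line)
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique


if __name__ == "__main__":
    print(dedupe_equivalent(["sa4 sb6", "sb6 sa4", "sa@ sb6"]))
```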

I guess you don't want to simply combine them but to create permutations of a list of rules. A small custom script should solve your problem.
#6
No, I think in this case it's different.

There are 11 sets, and some of them hold alternatives (an OR), like this:
Code:
(['sa4', 'sa@'], ['sb6'], ['sc<', 'sc{'], ['se3'], ['sg9'], ['si1', 'si!'], ['so0'], ['sq9'], ['ss5', 'ss$'], ['st7', 'st+'], ['sx%'])

So for the "a" replacement there are 2 alternatives, "sa4" and "sa@". Only one of them should be used within a single rule line, but both should get run in the end (each on its own line).
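Picking exactly one entry from each set is a Cartesian product, so a minimal sketch in Python (not a hashcat feature, just a helper script; `SETS` and `combine` are names chosen for this example) could look like this:

```python
from itertools import product

# One inner list per character; alternatives for the same character share a list
SETS = (
    ["sa4", "sa@"], ["sb6"], ["sc<", "sc{"], ["se3"], ["sg9"],
    ["si1", "si!"], ["so0"], ["sq9"], ["ss5", "ss$"], ["st7", "st+"], ["sx%"],
)


def combine(sets):
    """Return one rule line for every way of picking one entry per set."""
    return [" ".join(choice) for choice in product(*sets)]


if __name__ == "__main__":
    rules = combine(SETS)
    print(len(rules))  # 2 * 2 * 2 * 2 * 2 = 32
    print("\n".join(rules))
```

Because every line takes exactly one alternative per set, no two lines can be identical, so no dedupe pass is needed afterwards.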

The main problem with the maskprocessor or mask file generation approach is that hashcat only allows 4 custom charsets, so you would need to do something like this:
Code:
hashcat --stdout -a 3 -o my.rule -1 '4@' -2 '<{' -3 '1!' -4 '5$' 'sa?1 sb6 sc?2 se3 sg9 si?3 so0 sq9 ss?4 st7 sx%'
hashcat --stdout -a 3 -o my.rule -1 '4@' -2 '<{' -3 '1!' -4 '5$' 'sa?1 sb6 sc?2 se3 sg9 si?3 so0 sq9 ss?4 st+ sx%'

You could also use a hashcat mask file (.hcmask) instead of running 2 commands.

The trick is to use all the 4 allowed custom charsets and use "st7" and "st+" (or 1 of the charsets that couldn't fit in the 4 custom charsets) as separate commands or lines in the hcmask file.
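As a sketch of the .hcmask variant, the lines could be generated with a small Python helper. It assumes the usual .hcmask layout of comma-separated custom charsets followed by the mask, with a literal comma inside a charset escaped as `\,`. The helper name and the `<LEFT>` placeholder are made up for this example, and the first charset uses the replacement characters 4 and @ so that sa?1 expands to sa4 and sa@.

```python
def hcmask_lines(charsets, template, leftovers, slot="<LEFT>"):
    """Build one .hcmask line per alternative that did not fit a charset.

    charsets  -- up to 4 strings, used as ?1..?4 in the template
    template  -- mask text containing ?1..?4 and one `slot` placeholder
    leftovers -- rule functions to put into the slot, one line each
    """
    if len(charsets) > 4:
        raise ValueError("only 4 custom charsets fit on one mask line")
    # Assumed .hcmask convention: a literal comma in a charset is escaped
    prefix = ",".join(cs.replace(",", "\\,") for cs in charsets)
    return [f"{prefix},{template.replace(slot, alt)}" for alt in leftovers]


if __name__ == "__main__":
    lines = hcmask_lines(
        ["4@", "<{", "1!", "5$"],
        "sa?1 sb6 sc?2 se3 sg9 si?3 so0 sq9 ss?4 <LEFT> sx%",
        ["st7", "st+"],
    )
    print("\n".join(lines))
```

Each printed line corresponds to one of the two commands above: 16 combinations from the four charsets, times 2 lines, gives the 32 rules.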
#7
There's also this project, that tries to detect rules with redundant results:

https://github.com/0xbsec/duprule/
#8
(09-01-2020, 05:04 PM)royce Wrote: There's also this project, that tries to detect rules with redundant results:

https://github.com/0xbsec/duprule/

I downloaded this, installed Rust and ran it on some of the default rule sets, as well as on custom rules with known duplicates. Unfortunately, it only returned the same rules that were in the input, except without any line breaks. I posted the issue here: https://github.com/0xbsec/duprule/issues/32 . The project does seem to show very high potential, though.

Once I hit that wall, I posted here, thinking to myself 'I can't be the only person who wants to combine rules without a million duplicates'.
#9
(09-01-2020, 02:36 PM)philsmd Wrote: no, I think in this case it's different.

There are 11 sets and some of them are a set (or a OR) , like this:
Code:
(['sa4', 'sa@'], ['sb6'], ['sc<', 'sc{'], ['se3'], ['sg9'], ['si1', 'si!'], ['so0'], ['sq9'], ['ss5', 'ss$'], ['st7', 'st+'], ['sx%'])

so for the "a" replacement, there are 2 alternatives "sa4" and "sa@", but only one should be used within a rule line, but both should be run at the end (therefore there are 2 alternatives).

The main problem with the maskprocessor or mask file generation approach is that hashcat only allows 4 custom charsets, so you would need to do something like this:
Code:
hashcat --stdout -a 3 -o my.rule -1 '4@' -2 '<{' -3 '1!' -4 '5$' 'sa?1 sb6 sc?2 se3 sg9 si?3 so0 sq9 ss?4 st7 sx%'
hashcat --stdout -a 3 -o my.rule -1 '4@' -2 '<{' -3 '1!' -4 '5$' 'sa?1 sb6 sc?2 se3 sg9 si?3 so0 sq9 ss?4 st+ sx%'

you could also use a hashcat mask file (.hcmask) instead of running 2 commands.

The trick is to use all the 4 allowed custom charsets and use "st7" and "st+" (or 1 of the charsets that couldn't fit in the 4 custom charsets) as separate commands or lines in the hcmask file.

This is exactly what I'm referring to. 

I haven't been able to come up with a method (like the one described above) that's versatile and reproducible without me typing each character or each rule one by one.

My issue is that I haven't found a single rule collection that can take a 4-letter word like "food" and turn it into "Fo0d1" or even "Fo0ds". This seems like a very big shortcoming of the current rule lists. So I was hoping that if I could combine some already written rule sets, like uppercase-first-letter combinations and leetspeak combinations, and add different numbers/characters to the end, I'd be in better shape. But I haven't been able to do it without exponentially growing duplicates.
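One way to keep the count under control is to treat each kind of rule as its own group and pick exactly one entry per group, instead of stacking the same file several times with -r. A minimal sketch, where the groups are small stand-ins and not real rule files, `:` is hashcat's "do nothing" rule, `c` capitalizes the first letter and lowers the rest, and `$X` appends a character:

```python
from itertools import product


def combine_groups(*groups):
    """Return one rule line per way of picking exactly one entry per group."""
    lines = []
    for picks in product(*groups):
        # ':' does nothing, so drop it unless it is all that is left
        parts = [p for p in picks if p != ":"]
        lines.append(" ".join(parts) if parts else ":")
    return lines


if __name__ == "__main__":
    case_rules = [":", "c"]                 # leave as is / capitalize
    leet_rules = [":", "so0", "sa4 so0"]    # stand-in for the 32 leet lines
    append_rules = [":", "$1", "$s"]        # stand-in for the suffixes
    rules = combine_groups(case_rules, leet_rules, append_rules)
    print(len(rules))  # 2 * 3 * 3 = 18
    print("\n".join(rules))
```

The total is the product of the group sizes, so 32 leet lines combined with [":", "c"] would give 64 lines if ":" is kept, with no duplicates as long as each group is duplicate-free. Note that "c so0 $1" turns "food" into "F00d1", not "Fo0d1", because so0 replaces every "o".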
#10
Well, that sounds like a very different problem from the one we were discussing above, which was "just" having some alternatives/sets of replacements for a single character (like "try both sa4 and sa@").

For instance, you mention turning "food" into "Fo0d1". That isn't even a normal leetify; it's actually 3 very different kinds of rule: an uppercase rule, an append, and a random replacement, because it doesn't replace every "o" with "0", just a single/random instance of it. People often don't think about this very deeply, but there are far too many ways to turn one string into another, countless ways to do "some leet speak combined with other rules".
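To give a feel for how quickly "replace only some instances" grows, here is a small Python sketch that works on the word itself rather than on hashcat rules. It lists every variant in which a non-empty subset of the matching positions is replaced; `partial_replacements` is a name made up for this example.

```python
from itertools import combinations


def partial_replacements(word, src, dst):
    """List every variant of word with some (at least one) src replaced by dst."""
    positions = [i for i, ch in enumerate(word) if ch == src]
    variants = []
    for k in range(1, len(positions) + 1):
        for subset in combinations(positions, k):
            chars = list(word)
            for i in subset:
                chars[i] = dst
            variants.append("".join(chars))
    return variants


if __name__ == "__main__":
    print(partial_replacements("food", "o", "0"))
```

For "food" this gives "f0od", "fo0d" and "f00d", while a plain so0 only ever produces "f00d". A character that occurs n times yields 2^n - 1 variants, and that is for a single substitution, before capitalization or appended characters come into play.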

It's true that hashcat's rule engine is very specific, minimalistic and optimized for speed, so yes, there are some disadvantages. But it's good to have a rule engine that only supports deterministic mangling; a rule that just replaces "some random char" would not be a good thing.
That said, hashcat has a random rule generator that produces a set of random rules with -g xxx (where xxx is the number of rules). It doesn't invent rule functions beyond those on the wiki page, of course; it just builds random rules from the functions listed here: https://hashcat.net/wiki/doku.php?id=rule_based_attack .

Also see this: https://hashcat.net/wiki/doku.php?id=rul...onal_rules , where we added a feature to replace instance x of a specific character, but this only works with -j/-k (not with -r).