Duprule: detect & remove duplicate rules
#1
Duprule is a tool to detect & remove duplicate rules, inspired by issue 301 in hashcat.

Link

https://github.com/0xbsec/duprule/


How does it work?

TL;DR: The changes each rule makes are recorded in a map, and a unique id is generated from that map for each rule, together with its function count.

The mechanism works like this:

- A blank map is created with $n ( from 1 to 37 ) default characters.
- Each change made by a rule is applied to the map.
    Example: the rule 'u' changes the case of every character from '?' ( unknown ) to 'u' ( upper case ).
    'sab' adds {'a' -> 'b'} to the map. The same logic applies to the other rules.
- An id is generated from the map.
- The ids are compared to detect duplicate rules.
- Among duplicates, the rule with the lowest function count is kept. ( There's a plan to also take readability into account when selecting the rule; check issue #4 for updates. )
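
The steps above can be sketched in a few lines of Python. This is not duprule's actual code (the tool is written in Perl), only a minimal illustration of the idea. It assumes that "from 1 to 37" means one map per word length, models only a handful of rule functions ( :, u, l, t, [, ], DN, ^X, iNX ), and treats out-of-range positions as no-ops. The names apply_rule, rule_id and dedupe are made up for this example.

```python
from collections import defaultdict

# Case states: '?' unknown, 't' toggled, 'u' upper, 'l' lower.
TOGGLE = {"?": "t", "t": "?", "u": "l", "l": "u"}

def apply_rule(rule, n):
    """Return (final symbolic word, function count) for a word of length n."""
    # A slot is (origin, case): origin is the original position (int)
    # or an inserted literal character (str).
    word = [(pos, "?") for pos in range(n)]
    count = 0
    i = 0
    while i < len(rule):
        f = rule[i]
        i += 1
        if f == " ":                      # spaces only separate functions
            continue
        count += 1
        if f == ":":                      # no-op
            pass
        elif f in "ul":                   # force upper / lower case
            word = [(o, f) for o, _ in word]
        elif f == "t":                    # toggle case
            word = [(o, TOGGLE[c]) for o, c in word]
        elif f == "[":                    # delete first element
            word = word[1:]
        elif f == "]":                    # delete last element
            word = word[:-1]
        elif f == "D":                    # delete element at position N
            p = int(rule[i], 36)
            i += 1
            if p < len(word):
                del word[p]
        elif f == "^":                    # prepend literal X
            word.insert(0, (rule[i], "?"))
            i += 1
        elif f == "i":                    # insert literal X at position N
            p, ch = int(rule[i], 36), rule[i + 1]
            i += 2
            if p <= len(word):
                word.insert(p, (ch, "?"))
        else:
            raise ValueError(f"function {f!r} is not modelled in this sketch")
    return tuple(word), count

def rule_id(rule, lengths=range(1, 38)):
    """Id of a rule: its final map for every word length from 1 to 37."""
    return tuple(apply_rule(rule, n)[0] for n in lengths)

def dedupe(rules):
    """Group rules by id; keep the one with the fewest functions per group."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule_id(rule)].append(rule)
    kept = [min(g, key=lambda r: apply_rule(r, 1)[1]) for g in groups.values()]
    duplicates = [g for g in groups.values() if len(g) > 1]
    return kept, duplicates
```

Calling dedupe(["^a^i^d", "i0d i1i i2a", ":", "tt", "D2 D2", "D3D2", "u"]) keeps "^a^i^d", ":", "D2 D2" and "u", and reports the other three rules as duplicates. When two duplicates have the same function count, this sketch simply keeps the first one seen.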

Which rules are supported?

Currently all rules on this page are supported except: 
    - Memory rules: X, 4, 6, M
    - Reject plains rules
    - E

Usage

git clone https://github.com/0xbsec/duprule.git
perl duprule.pl < [rule file]

duprule.pl takes input rules from STDIN.
Example: 
perl duprule.pl < /tmp/rockyou.rule
This prints the unique rules, in no particular order, to STDOUT and saves the duplicate rules to duplicates.txt.


This is still in development and some features are not supported yet. Feature requests, tests, and bug reports are welcome.

I'd like to hear your thoughts about the tool.
#2
Sounds interesting. Can you show some of the results and explain them in words?
#3
@atom Thanks for your reply!
Here are some of the duplicates from rules/T0XlCv1.rule ( complete list: here ):

perl duprule.pl < ../hashcat/rules/T0XlCv1.rule

1. ^a^i^d, i0d i1i i2a
2. x32, x32+2
3. :, tt
4. D2 D2, D3D2
5. '3d, x03d
6. *22se3, se3
7. x31Y3i7&i5vL9'4, x31
8. ] { [ }, D1], ],1[

Explanation:

1. ^a^i^d makes this change to the map:
    - Prepend elements with the characters 'a', then 'i', then 'd' to the start of the word.
i0d i1i i2a
    - Insert an element with character 'd' at position 0, 'i' at position 1, and 'a' at position 2.
2. x32
    - Starting from position 3, take two characters, so the map is reduced to these two characters.
x32+2
    - As above, but then increment the character at position 2 ( character number 3 ) by 1 ASCII value. As the third character no longer exists, the two rules are duplicates.
3. :
    - Change nothing.
tt
    - The first 't' toggles all the cases: 'u' -> 'l', 'l' -> 'u', 'd' ( default, unknown case ) -> 'b' ( the opposite case of 'd' ). The second 't' toggles all the cases back. That's why it's a duplicate of ':'.
4. D2 D2
    - Delete the element at position 2, then, in the new word, delete the element at position 2 again ( it was originally at position 3 ).
D3D2
    - Delete the element at position 3, then delete the element at position 2.
5. 'n has the same effect as x0n.
6. *nn swaps position n with position n, so nothing happens.
7. x31 takes 1 character starting at position 3, so the word is reduced to a single element at position 0, and all later changes to elements at positions > 0 have no effect.
8. ] { [ }
    - Delete the last element.
    - Rotate the word left.
    - Delete the element at position 0 ( its old position was 1 ).
    - Rotate the word right.
D1]
    - Delete the element at position 1.
    - Delete the last element.
],1[
    - Delete the last element.
    - Replace the element at position 1 with the element at position 0.
    - Delete the element at position 0.
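
As a sanity check, the eight groups above can be applied to a concrete sample word with a small Python sketch. This is not hashcat's rule engine or duprule's code. It models only the functions that appear in these examples, and it assumes that a function whose position or range falls outside the word leaves the word unchanged, which is my understanding of hashcat's behaviour. The sample word "Letmein12" and the names apply and check are made up for this example. Agreement on one word is evidence that the rules are duplicates, not a proof.

```python
# Number of argument characters each modelled function takes.
ARITY = {"t": 0, "d": 0, "[": 0, "]": 0, "{": 0, "}": 0,
         "^": 1, "+": 1, "D": 1, "'": 1, "Y": 1, "L": 1, ",": 1,
         "i": 2, "x": 2, "*": 2, "s": 2}

def pos(ch):
    """Decode a position character: 0-9, then A-Z for 10-35."""
    return int(ch, 36)

def apply(rule, word):
    """Apply a rule (subset of functions only) to a concrete word."""
    w, i = list(word), 0
    while i < len(rule):
        f = rule[i]
        i += 1
        if f in " :":                     # separator or no-op
            continue
        a = rule[i:i + ARITY[f]]          # argument characters
        i += ARITY[f]
        n = len(w)
        if f == "t":                      # toggle case
            w = [c.swapcase() for c in w]
        elif f == "d":                    # duplicate the whole word
            w = w + w
        elif f == "[":                    # delete first
            w = w[1:]
        elif f == "]":                    # delete last
            w = w[:-1]
        elif f == "{":                    # rotate left
            w = w[1:] + w[:1]
        elif f == "}":                    # rotate right
            w = w[-1:] + w[:-1]
        elif f == "^":                    # prepend X
            w = [a] + w
        elif f == "s":                    # replace X with Y
            w = [a[1] if c == a[0] else c for c in w]
        elif f == "i":                    # insert X at position N
            if pos(a[0]) <= n:
                w.insert(pos(a[0]), a[1])
        elif f == "x":                    # extract M characters from N
            p, m = pos(a[0]), pos(a[1])
            if p + m <= n:
                w = w[p:p + m]
        elif f == "*":                    # swap positions N and M
            p, q = pos(a[0]), pos(a[1])
            if p < n and q < n:
                w[p], w[q] = w[q], w[p]
        elif f == "Y":                    # append the last N characters
            m = pos(a)
            if m <= n:
                w = w + w[n - m:]
        else:                             # single-position functions
            p = pos(a)
            if p < n:
                if f == "D":              # delete at N
                    del w[p]
                elif f == "'":            # truncate at N
                    w = w[:p]
                elif f == "+":            # increment ASCII value at N
                    w[p] = chr(ord(w[p]) + 1)
                elif f == "L":            # bitwise shift left at N
                    w[p] = chr((ord(w[p]) << 1) & 0xFF)
                elif f == "," and p > 0:  # copy N-1 onto N
                    w[p] = w[p - 1]
    return "".join(w)

GROUPS = [
    ["^a^i^d", "i0d i1i i2a"],
    ["x32", "x32+2"],
    [":", "tt"],
    ["D2 D2", "D3D2"],
    ["'3d", "x03d"],
    ["*22se3", "se3"],
    ["x31Y3i7&i5vL9'4", "x31"],
    ["] { [ }", "D1]", "],1["],
]

def check(word):
    """Return, per group, the set of distinct outputs its rules produce."""
    return [{apply(rule, word) for rule in group} for group in GROUPS]

# Each set holds exactly one word when the rules in a group agree.
for group, outputs in zip(GROUPS, check("Letmein12")):
    print(group, "->", outputs)
```

For "Letmein12", every group collapses to a single output, for example "LetLet" for group 5 and "Ltmein1" for group 8.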

For more examples, please have a look at this test, and this test gives an overview of the map.
#4
Looks good, thanks for the contribution.