Wordlist optimisation based on ruleset
#8
Hey all, I don't know if this is faster or not, but for the issue of really large file sizes and speed, this works really well for me.

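For example, something like this strips everything except letters from each line and then removes duplicates. Note that `uniq` only drops duplicates that sit next to each other, so sort the list first (or use `sort -u`) if it isn't already sorted.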

Code:
cat WORDLIST | sed 's#[^a-zA-Z]##g' | uniq > OUTPUT

WORDLIST

Quote:
Test010!
test2
$$3tests


OUTPUT

Quote:
Test
test
tests
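
If the list is not sorted, a rough sketch of the same idea with a full dedup could look like this. The file names `sample_wordlist.txt` and `sample_output.txt` are made up for the demo, and the extra `test99` line is only there to create a duplicate that is not next to its twin. `LC_ALL=C` is optional; it makes `sort` compare raw bytes, which is usually quicker on big lists.

```shell
# Demo input: the three lines from the example plus one non-adjacent repeat.
# (File names are hypothetical; swap in your own.)
printf '%s\n' 'Test010!' 'test2' '$$3tests' 'test99' > sample_wordlist.txt

# Strip everything except letters, then sort and dedup the whole file.
# Plain uniq would leave both "test" lines here because they are not adjacent.
sed 's#[^a-zA-Z]##g' sample_wordlist.txt | LC_ALL=C sort -u > sample_output.txt

cat sample_output.txt
```

With the demo input this prints Test, test and tests, one per line. The trade-off is that the original order of the wordlist is lost.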



It is admittedly not very user-friendly to figure out, but it does open up other possibilities. If your wordlist is compressed with gzip or xz, for example, you can use `zcat` or `xzcat` instead of `cat`, and then pipe the result back into a compressor at the end. Well, I hope this helps someone. You can also throw the `-i` flag on the `uniq` command so that it ignores case when comparing lines. It keeps the first spelling it sees, so add `tr '[:upper:]' '[:lower:]'` before `uniq` if you want everything lowercased. With `-i` alone you'd be left with:


OUTPUTCASE

Quote:
Test
tests
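
For the compressed case, a minimal sketch might look like the following. It assumes a gzip-compressed list; the file names `sample_wordlist.gz` and `sample_output.gz` are made up for the demo, and `gzip -dc` is used in place of `zcat` (it does the same job). The `tr` step is an extra that lowercases everything and is not part of the one-liner above.

```shell
# Build a small gzip-compressed demo list (hypothetical file name).
printf '%s\n' 'Test010!' 'test2' '$$3tests' | gzip -c > sample_wordlist.gz

# Decompress to stdout, strip non-letters, lowercase,
# drop adjacent duplicates, and compress the result again.
gzip -dc sample_wordlist.gz \
  | sed 's#[^a-zA-Z]##g' \
  | tr '[:upper:]' '[:lower:]' \
  | uniq \
  | gzip -c > sample_output.gz

# Peek at the result without writing an uncompressed copy to disk.
gzip -dc sample_output.gz
```

With the demo input this prints test and tests. For xz, swapping in `xz -dc` and `xz -c` should work the same way.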



NOTE: The above works natively on Linux, and I think on Mac as well. On Windows you can use the same commands if you install Git for Windows and make sure you choose to install Git Bash in the options.


Messages In This Thread
RE: Wordlist optimisation based on ruleset - by pdoctor - 08-12-2020, 11:36 AM