This should be very easy to do with, for instance, a few shell commands.
A few things you should consider first:
1. Why not sort and unique the dict beforehand?
2. How large is the dict? Must it be really fast, i.e. do you need to optimize the filtering at all?
3. Do you really need a dict for everything? It might be smarter to brute-force the short candidates (lengths 1 to x, where x depends on the algo and could be up to 7-8) and then use the dict only for lengths > x and < 8, for instance.
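Anyway, on Linux you could do something like this (note that {0,8} also lets empty lines and 8-char words through; use {1,7} if you want strictly fewer than 8 chars):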
Code:
$ sort -u orig_dict.txt -o dict_unique.txt
$ grep -E '^[0-9a-zA-Z]{0,8}$' dict_unique.txt > less_than_8.txt
OR
Code:
$ < orig_dict.txt parallel --pipe --gnu grep -E '^[0-9a-zA-Z]{0,8}$' | sort -u -o less_than_8.txt
Note: there are many ways to speed up the filtering. For instance, running grep with LANG=C often helps. You could also consider doing it with awk or other tools: if the dict is not too large, you can do the filtering and the sort/unique step with a single awk one-liner. Unfortunately this doesn't scale very well to larger dicts.
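To make the note above a bit more concrete, here is a rough sketch of both ideas: grep under the C locale, and an awk one-liner that filters and uniques in a single pass. I use LC_ALL=C rather than LANG=C because LC_ALL overrides the other locale variables. The sample file, its contents and the output file names are made up for illustration, and the bounds here are 1 to 8 chars, so empty lines are skipped:

```shell
# Tiny made-up sample dict so the sketch runs on its own; use your real dict instead.
printf '%s\n' password abc123 abc123 'with space' toolongpassword1 > sample_dict.txt

# LC_ALL=C makes grep and sort work byte-wise, which is often faster than a UTF-8 locale.
LC_ALL=C grep -E '^[0-9a-zA-Z]{1,8}$' sample_dict.txt | LC_ALL=C sort -u > short_sorted.txt

# awk alternative: filter and unique in one pass, keeping first-seen order (no sorting).
# The seen[] array keeps every accepted line in memory, so this only suits smaller dicts.
LC_ALL=C awk 'length($0) <= 8 && /^[0-9a-zA-Z]+$/ && !seen[$0]++' sample_dict.txt > short_unsorted.txt
```

The awk variant keeps the words in their original order instead of sorting them; its memory use grows with the number of unique words, which is why it does not scale to big dicts.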
UPDATE: I think I got it wrong and you meant all passwords with length >= 8. If so, then something like:
Code:
$ awk '$0~/^[0-9a-zA-Z]{8,}$/' dict_unique.txt > 8_or_more.txt
OR
Code:
$ grep -E '^[0-9a-zA-Z]{8,}$' dict_unique.txt > 8_or_more.txt
might work.
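If your awk does not understand the {8,} interval syntax (some older builds do not), a length() check is a more portable way to write the same filter. A minimal sketch; the sample file, its contents and the output file name are made up for illustration:

```shell
# Made-up sample input so the sketch runs on its own; use dict_unique.txt in practice.
printf '%s\n' short password longerpass1 'has space here' > sample_long_dict.txt

# Keep lines that are at least 8 chars long and consist only of ASCII letters and digits.
LC_ALL=C awk 'length($0) >= 8 && /^[0-9a-zA-Z]+$/' sample_long_dict.txt > sample_8_or_more.txt
```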