Combinator Attack and unique
#1
Hi,
When using the Combinator attack (with two wordlists), hashcat does not remove duplicate words.
I think that would be useful, but maybe there is a reason why the developers did not implement it?
Thanks.
#2
The assumption is that the user will deduplicate, if desired.
#3
From an algorithmic perspective, would it be faster to sort and uniq on the CPU or on the GPU for huge files?
#4
CPU, I would think.

Here's a shell function (I use it like an alias) that I stole from epixoip and that works well on Linux:

Code:
   bigsort() { LC_ALL=C sort --parallel=4 -S 4000M -T /path/to/fast/storage/ "$@"; }

Adjust the '--parallel' and '-S' values to taste, based on your number of cores, RAM, etc. Pass -u as well if you want duplicates dropped while sorting.
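
For the original question, a minimal sketch of deduplicating each wordlist before a Combinator run might look like the following. The function name dedupe_sorted and the file names in the comments are made up for illustration; the only thing relied on is the standard -u option of sort.

```shell
# dedupe_sorted: hypothetical helper, not part of hashcat or hashcat-utils.
# Sorts one wordlist bytewise (LC_ALL=C) and keeps a single copy of each line.
dedupe_sorted() {
    LC_ALL=C sort -u -- "$1"
}

# Example usage (file names are placeholders):
#   dedupe_sorted left.txt  > left.uniq.txt
#   dedupe_sorted right.txt > right.uniq.txt
#   hashcat -a 1 -m 0 hashes.txt left.uniq.txt right.uniq.txt
```

Note that this only removes duplicates inside each list. The combined candidates can still repeat, since "ab"+"c" and "a"+"bc" both produce "abc".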

This assumes that you don't need to preserve order, and it's only one file.  If you're managing a library of wordlists, for which order is significant (most likely words first), I've had good luck using rli/rli2 from hashcat-utils to remove duplicates among files.
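
If order does matter, one common alternative is an order-preserving dedupe with awk. This is a sketch of a different technique than rli/rli2, and the function name dedupe_keep_order is made up here. It assumes the set of unique lines fits in RAM, because awk keeps every distinct line in an in-memory table.

```shell
# dedupe_keep_order: hypothetical helper, not part of hashcat-utils.
# Prints each line the first time it is seen and skips later repeats,
# so the original ordering (most likely words first) is preserved.
# With several files, lines already seen in an earlier file are dropped.
dedupe_keep_order() {
    awk '!seen[$0]++' "$@"
}

# Example usage (file names are placeholders):
#   dedupe_keep_order top.txt bigger.txt > merged.txt
```

For lists too large to hold in memory this way, the sort-based approach or rli/rli2 is the safer choice.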