09-19-2024, 05:12 AM
(09-18-2024, 05:15 PM)Snoopy Wrote:(09-18-2024, 10:26 AM)mima8cn Wrote: The large dictionary I downloaded online is about 150GB. How can I filter out the entries that are already in my existing dictionary files? My computer has 80GB of memory, and when I run rli.exe on files larger than 20GB, it reports insufficient memory. Is there any other way to filter out content that is duplicated in other files?
The operating system is Windows.
In addition, the dictionary is a single 150GB file. Is there any way to filter out duplicate content within a single file?
Depending on the number of passwords / the size of your dictionaries and the hash type being attacked, I would just leave it as it is.
When attacking fast hashes like NTLM or MD5, small dictionaries are processed almost instantly, so filtering your dictionary files would take more time than just hashing the duplicates again.
Anyway, you could use the Windows Subsystem for Linux (WSL) with tools like sort and comm for this, but comm needs sorted input, so sorting all of your input files beforehand will also take some time. I'm not quite sure whether sort can handle files that big or not.
jfyi
big.txt (after sort)
Code:
1
10
2
3
4
5
6
7
8
9
small.txt
Code:
3
5
7
Code:
comm -23 big.txt small.txt > uniq-big.txt
would result in the lines unique to big.txt (big.txt minus small.txt)
Code:
1
10
2
4
6
8
9
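For anyone who wants the same subtraction without comm, here is a minimal Python sketch of that step. It assumes both inputs are already sorted in plain byte order (the order shown above, where 10 comes before 2). The names lines_only_in_first and filter_sorted_file are made up for this example. One difference from comm: a line present in the small file removes every copy of that line from the big file.

```python
def lines_only_in_first(big_lines, small_lines):
    """Yield lines of sorted big_lines that never occur in sorted small_lines."""
    small_iter = iter(small_lines)
    current = next(small_iter, None)
    for line in big_lines:
        # Advance the small side until it catches up with the big side.
        while current is not None and current < line:
            current = next(small_iter, None)
        if current is None or current != line:
            yield line


def filter_sorted_file(big_path, small_path, out_path):
    """Stream both sorted files from disk; only one line of each is held in memory."""
    with open(big_path, "rb") as big, open(small_path, "rb") as small, \
            open(out_path, "wb") as out:
        # Strip line endings so CRLF and LF files compare the same way.
        big_lines = (raw.rstrip(b"\r\n") for raw in big)
        small_lines = (raw.rstrip(b"\r\n") for raw in small)
        for line in lines_only_in_first(big_lines, small_lines):
            out.write(line + b"\n")
```

Calling filter_sorted_file("big.txt", "small.txt", "uniq-big.txt") would write the same seven lines as the example. Because both files are read as streams, memory use stays small regardless of file size, but the inputs still have to be sorted first.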
This command is for Linux systems, and my system is Windows. I want to know how to filter out duplicate content within a file and across files on Windows.
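One option that runs natively on Windows is a short Python script. The sketch below is only an illustration, not a tested tool for 150GB inputs: it removes duplicate lines from a single file by first splitting it into bucket files by hash, so that identical lines always land in the same bucket, and then deduplicating each bucket in memory. The function name dedupe_large_file and the default of 256 buckets are assumptions made for this example. It needs free disk space for a temporary copy of the input, and the output does not keep the original line order.

```python
import os
import tempfile
import zlib


def dedupe_large_file(in_path, out_path, num_buckets=256, tmp_dir=None):
    """Write each distinct line of in_path to out_path once (order not preserved)."""
    with tempfile.TemporaryDirectory(dir=tmp_dir) as work:
        paths = [os.path.join(work, f"bucket_{i:04d}.tmp") for i in range(num_buckets)]
        buckets = [open(p, "wb") for p in paths]
        try:
            # Pass 1: route every line to a bucket chosen by its CRC32 hash.
            with open(in_path, "rb") as src:
                for raw in src:
                    line = raw.rstrip(b"\r\n")
                    buckets[zlib.crc32(line) % num_buckets].write(line + b"\n")
        finally:
            for bucket in buckets:
                bucket.close()
        # Pass 2: each bucket is small enough to deduplicate with an in-memory set.
        with open(out_path, "wb") as out:
            for p in paths:
                seen = set()
                with open(p, "rb") as bucket:
                    for line in bucket:
                        if line not in seen:
                            seen.add(line)
                            out.write(line)
```

With 256 buckets, a 150GB input gives buckets of roughly 600MB each on average, although a Python set needs noticeably more memory than the raw bucket size. If a bucket still does not fit, raising num_buckets is the first thing to try, keeping in mind that the script holds that many files open at once.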