hashcat Forum
How to efficiently manage huge (>100 GB) wordlists? - Printable Version



How to efficiently manage huge (>100 GB) wordlists? - questme - 06-18-2014

Heya,

For my special use case, brute force doesn't work as well as a wordlist. My list is already dozens of GB, and every now and then I add new lists of a few GB to the old list and run a simple "sort -u oldlist.txt > newlist.txt" to remove the duplicates.

Hashcat works great with such big lists, but managing the list (adding new entries without storing all the duplicates) is a pain and takes a lot of time.

Are there some best practices to manage wordlists of this size? Maybe using a NoSQL-DB like LevelDB?


RE: How to efficiently manage huge (>100 GB) wordlists? - Kgx Pnqvhm - 06-18-2014

ULM may help: http://unifiedlm.com/Home

Also, someone on the hashkiller forum is working on something more powerful:
http://forum.hashkiller.co.uk/topic-view.aspx?t=5512&m=37742#37742


RE: How to efficiently manage huge (>100 GB) wordlists? - undeath - 06-19-2014

ULM isn't suitable for such big collections.

Adding new entries to your dict, however, can be done much faster. Since the old dict is already sorted, you only need to sort the new one and then merge the two:

sort newdict -o newdict && sort -m -u olddict newdict -o mergeddict
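
The same idea wrapped in a small shell function, as a rough sketch rather than a definitive recipe. The function name merge_wordlist and the temporary ".sorted" file are made up for illustration. It assumes the old dict was itself sorted under LC_ALL=C; merging files that were sorted with different collation rules can give wrong results, so if in doubt, re-sort the old dict once with the same setting.

```shell
#!/bin/sh
# merge_wordlist OLD NEW OUT   (hypothetical helper, not part of hashcat)
# Merges NEW into the already-sorted OLD and writes the deduplicated result to OUT.
merge_wordlist() {
    old=$1
    new=$2
    out=$3

    # Sort and dedupe only the (small) new list.
    # LC_ALL=C forces plain byte-wise comparison, which is usually faster
    # and gives the same order on every machine.
    LC_ALL=C sort -u -o "$new.sorted" "$new" || return 1

    # -m merges inputs that are already sorted in a single pass,
    # -u drops duplicate lines while merging.
    LC_ALL=C sort -m -u -o "$out" "$old" "$new.sorted" || return 1

    # Remove the intermediate file.
    rm -f "$new.sorted"
}

# Example call:
#   merge_wordlist olddict newdict mergeddict
```

Only the new list gets a full sort here; the big old dict is just read through once during the merge, which is where the time saving comes from.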


RE: How to efficiently manage huge (>100 GB) wordlists? - Kgx Pnqvhm - 06-20-2014

How does the MST (Multiple Sort-Tools) on SmallUtilities.org (related to Hashes.org) compare?