--show step too slow using large file
#1
Hi

I am trying to crack an MD5-based hash list with 100 million lines.

I started by splitting it into files of 10 million lines each.

Then I ran:


./hashcat64.bin -a 3 -m  0  --potfile-path ./md5.pot --username   <nameofthefilewithhashes> ?l?l?l?l?l?l ?l?l 

That took several hours to complete and cracked 70% of the passwords. I can see that the md5.pot file is now populated (around 200 MB). Then I ran:

./hashcat64.bin -a 3 -m  0  --potfile-path ./md5.pot -o <nameofoutputfile> --outfile-format 2 --username   <nameofthefilewithhashes>  --show

This is taking forever to complete (more than 24 hours already). Is this expected?

I did a test with a list of 10,000 hashes and it took more than 3 minutes to create the file using --show. Based on that, the 10 million file would take more than 2 days. Is this expected behavior? Is there anything that can be done? I would expect the cracking to take longer than matching each cracked password to its original username...


Many thanks in advance

Octan
#2
So... 72 hours already and still running. One CPU is at 100%, so I guess it is still working on it. Any hints? Many thanks in advance.
#3
The problem here is the --username switch. This causes hashcat to allocate a large memory block for each hash. If you have 100 million, it will take years. So the solution would be to not use --username and merge it afterwards yourself.
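
A rough sketch of what "merge it afterwards yourself" could look like, using an in-memory lookup table. This is not part of hashcat. It assumes the potfile lines look like hash:plain, the original list looks like username:hash, and the desired output is username:plain. The function names load_potfile and merge_usernames are made up for this example. It also does not decode plains that hashcat wrote in $HEX[...] notation.

```python
def load_potfile(lines):
    """Build a hash -> plain lookup from potfile lines (assumed 'hash:plain')."""
    cracked = {}
    for line in lines:
        line = line.rstrip("\r\n")
        # Split on the first colon only, since the plain may contain colons.
        hash_value, sep, plain = line.partition(":")
        if sep:
            cracked[hash_value.lower()] = plain
    return cracked


def merge_usernames(user_lines, cracked):
    """Yield 'username:plain' for every 'username:hash' line that was cracked."""
    for line in user_lines:
        line = line.rstrip("\r\n")
        # Assumption: an MD5 hex digest has no colon, so split on the last one.
        username, sep, hash_value = line.rpartition(":")
        if not sep:
            continue
        plain = cracked.get(hash_value.lower())
        if plain is not None:
            yield f"{username}:{plain}"


if __name__ == "__main__":
    # Tiny in-memory demo; real use would pass open file objects instead.
    pot = ["5d41402abc4b2a76b9719d911017c592:hello\n"]
    users = ["alice:5d41402abc4b2a76b9719d911017c592\n",
             "bob:00000000000000000000000000000000\n"]
    for merged in merge_usernames(users, load_potfile(pot)):
        print(merged)
```

Each lookup is a single dictionary access, so the work grows roughly linearly with the number of lines. The cost is memory: the whole potfile has to fit in RAM, so for a very large potfile a sort-based merge may be the better fit.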
#4
(05-07-2017, 08:07 PM)atom Wrote: The problem here is the --username switch. This causes hashcat to allocate a large memory block for each hash. If you have 100 million, it will take years. So the solution would be to not use --username and merge it afterwards yourself.

Great, many thanks for your answer. I ended up cancelling it after 4 days. I think I will do something like this after finishing the cracking exercise:


1.- Sort the username-hash list by hash
2.- Sort the hash-password pot file by hash
3.- Create a new file by reading each line in 1 and looking up its hash in 2, unless the hash is the same as on the previous line (in which case the previous result can be reused).

Let's see how long that takes. It will probably take a long time as well, since it would have to go over the pot file again and again. I will work on improving the algorithm and will post the results. Many thanks again.
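
For what it's worth, once both lists are sorted by hash, the three steps above can probably be done in a single pass over each file, without rescanning the pot file. A minimal sketch of that idea (a sort-merge join), assuming the lines have already been parsed into (hash, username) and (hash, plain) pairs, both sorted with the same ordering, and that each hash appears at most once in the pot file. The name merge_join is only for illustration.

```python
def merge_join(users_sorted, pot_sorted):
    """Join two hash-sorted streams in one pass.

    users_sorted: iterable of (hash, username), sorted by hash.
    pot_sorted:   iterable of (hash, plain), sorted by hash, one entry per hash.
    Yields (username, plain) for every user whose hash is in the pot.
    """
    pot_iter = iter(pot_sorted)
    pot_entry = next(pot_iter, None)
    for hash_value, username in users_sorted:
        # Advance the pot stream until it catches up with the current hash.
        while pot_entry is not None and pot_entry[0] < hash_value:
            pot_entry = next(pot_iter, None)
        if pot_entry is None:
            break  # pot exhausted, no further matches are possible
        if pot_entry[0] == hash_value:
            # The pot entry is not consumed, so the next user with the
            # same hash reuses it (the "equal to previous line" case).
            yield username, pot_entry[1]


if __name__ == "__main__":
    # Toy data with short stand-in strings instead of real MD5 digests.
    users = sorted([("bbb", "bob"), ("aaa", "alice"), ("bbb", "carol")])
    pot = sorted([("aaa", "pw1"), ("bbb", "pw2")])
    for username, plain in merge_join(users, pot):
        print(f"{username}:{plain}")
```

Neither stream is ever rewound, so after sorting, the merge itself is linear in the total number of lines. For files too big for memory, the sorting would have to be done on disk first, and both files need the same sort order (for example, lowercase hex compared byte by byte).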