Wrong number of total hashes?

I have downloaded a list of hashes from http://dazzlepod.com/site_media/txt/hashes.txt, which contains 551638 MD5 hashes.

Quote:cat md5.txt|sort|uniq|wc -l

Hashcat also sees 551638 ("Hashes: 551638 total").
But at the end, it says :

Recovered......: 83818/426429 (19.66%) Digests, 0/1 (0.00%) Salts

Where does "426429" come from?
I would expect 83818/551638, no?

Thank you.

Quote:oclHashcat-plus-0.14\oclHashcat-plus64.exe -m 0 --username -o res.txt md5.txt passwords.txt
oclHashcat-plus v0.14 by atom starting...

Hashes: 551638 total, 1 unique salts, 426429 unique digests
Bitmaps: 21 bits, 1048576 entries, 0x000fffff mask, 4194304 bytes
Rules: 1
Workload: 1024 loops, 32 accel
Watchdog: Temperature abort trigger disabled
Watchdog: Temperature retain trigger disabled
Device #1: Cypress, 1024MB, 850Mhz, 20MCU
Device #1: Kernel C:\H\oclHashcat-plus-0.14/kernels/4098/m0000_a0.Cypress_1084.4_1084.4 (VM).kernel (1109768 bytes)
Cache-hit dictionary stats passwords.txt: 20163421 bytes, 2144233 words, 2144233 keyspace

Session.Name...: oclHashcat-plus
Status.........: Exhausted
Input.Mode.....: File (passwords.txt)
Hash.Target....: File (md5.txt)
Hash.Type......: MD5
Time.Started...: Fri Mar 29 11:51:04 2013 (22 secs)
Time.Estimated.: 0 secs
Speed.GPU.#1...: 105.1k/s
Recovered......: 83818/426429 (19.66%) Digests, 0/1 (0.00%) Salts
Progress.......: 2144233/2144233 (100.00%)
Rejected.......: 0/2144233 (0.00%)
(If needed, passwords.txt can be found here: http://dazzlepod.com/site_media/txt/passwords.txt)
Are you sure about the format of the hashes.txt list?
It seems to be number[8 digits]:hash[MD5, 32 hex characters]
Therefore, a command like:
cut -b 8- hashes.txt|sort -u|wc -l
gives a MUCH smaller number than your 551638... so the hashes are *not* unique (only the lines are, since they are "numbered")

Furthermore, the answer is already in your question: *426429 unique digests*

PS: it would be better to split the two parts with cut -d ":" -f 2
and also to test cut .....|uniq -d, which shows a lot of duplicates.
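
To make the difference between unique lines and unique hashes concrete, here is a minimal sketch on a tiny made-up sample. The file name sample.txt and its three lines are invented for illustration, and the number:hash layout is only assumed from what the list seems to look like:

```shell
# Hypothetical sample in the assumed number:hash layout.
# Lines 1 and 2 carry the same hash behind different numbers.
tmp=$(mktemp -d)
printf '%s\n' \
  '00000001:5f4dcc3b5aa765d61d8327deb882cf99' \
  '00000002:5f4dcc3b5aa765d61d8327deb882cf99' \
  '00000003:e10adc3949ba59abbe56e057f20f883e' > "$tmp/sample.txt"

# Unique lines: the number prefix makes every line different.
unique_lines=$(sort -u "$tmp/sample.txt" | wc -l | tr -d ' ')

# Unique hashes: keep only the field after the colon, then deduplicate.
unique_hashes=$(cut -d: -f2 "$tmp/sample.txt" | sort -u | wc -l | tr -d ' ')

echo "unique lines:  $unique_lines"
echo "unique hashes: $unique_hashes"
rm -rf "$tmp"
```

On this sample the first count is 3 and the second is 2, which is the same kind of gap as 551638 total vs 426429 unique digests.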

BTW (the output of the split):
$ cut -d: -f2 hashes.txt|sort -u|wc -l
Hope this solves your problem
Oh yes, I confirm, thanks :)

cat hashes.txt|cut -d: -f2|sort|wc -l
cat hashes.txt|cut -d: -f2|sort|uniq|wc -l

(I wonder why the owner of the list did not do that, anyway.)

Thread closed :)
Good to know that it worked!

BTW, I know that we all love *cat here, but there are times you shouldn't use the cat command... hehe
Why are people using cat to grep stuff, cat to sort stuff, cat to cut stuff, etc.? It creates an extra process and an extra pipe for nothing. You can definitely skip it!
You're right, it's a bad habit
Quote:cat md5.txt|sort|uniq|wc -l

i will stab you.

Quote:sort -u md5.txt | wc -l
Those commands do *not* solve the problem that we had: unique lines vs. unique hashes! (We need to eliminate the strings before the colon.)

BTW, nobody needs an extra pipe between *TWO* commands just to count unique lines, e.g.:
$ awk '{i[$0]++}END{print length(i)}' hashes.txt
$ # or many others w/o pipe!
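
For the unique-hashes count (the actual problem above), the same idea can key the array on the field after the colon instead of the whole line. This is only a sketch, assuming the number:hash layout; sample.txt is a made-up file, not the real list:

```shell
# Hypothetical sample in the assumed number:hash layout.
tmp=$(mktemp -d)
printf '%s\n' \
  '00000001:5f4dcc3b5aa765d61d8327deb882cf99' \
  '00000002:5f4dcc3b5aa765d61d8327deb882cf99' \
  '00000003:e10adc3949ba59abbe56e057f20f883e' > "$tmp/sample.txt"

# -F: splits each line on the colon, so $2 is the hash.
# n is incremented only the first time a given hash is seen.
unique_hashes_awk=$(awk -F: '!seen[$2]++ { n++ } END { print n + 0 }' "$tmp/sample.txt")

echo "unique hashes: $unique_hashes_awk"
rm -rf "$tmp"
```

No pipe and no sort are needed here, at the cost of holding every distinct hash in memory.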
it solves the problem of stringing together a ton of redundant commands.

that awk one-liner is hideous.