best64.rule contest
#11
rockyou for sure, but linkedin was uniq'd so it's no good for this purpose.
#12
Yeah, uniq'ing the hashlist wasn't good. Btw, there is also "10-million-combos.txt" from Mark Burnett. I've already replaced many of my "rockyou.txt" tasks with this list.

+ real people's passwords
+ made for research
+ nearly same size as rockyou.txt
+ from different sites not just one
+ not gaming sites, so maybe more serious passwords
- not a leak, cracked passwords
#13
Have you taken into account any of the issues cited in "A list of flaws in the data set_10millionpasswords" at
https://www.reddit.com/r/10millionpasswo..._data_set/
#14
No, that's actually new to me, thanks! I've gone through the list and pulled out what could be a problem if we used it for a contest:

- used cleanup scripts (don't do this to your wordlists unless you really know what you're doing)
- email addresses
- default passwords tend to skew lists
- weighted criteria
- hashes in wordlist
#15
I'm not sure whether uniq'd wordlists pose a problem for this contest (linkedin). I guess the difference between the total number of cracked passwords and the number of unique cracked passwords is relatively small, because commonly used passwords usually follow weak rules (or none at all). Conversely, non-uniq'd lists might push up random spam bot passwords.
#16
The list from the previous contest was unique on purpose.

There are two reasons why you find duplicates in any dump: simple passwords and site-specific passwords. Neither is useful for building a stronger ruleset.
#17
I strongly disagree, James. Duplicates are essential for sorting rules by probability. Just as you'd never generate an hcstat file with a wordlist that's been uniq'd. By removing duplicates you are skewing the stats.
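
To make the point concrete, here is a rough sketch of how the ranking can flip once duplicates are removed. Everything in it is an assumption for illustration: `apply_rule` is a toy interpreter for a handful of hashcat rule functions (`:`, `l`, `u`, `c`, `$X`, `^X`), `rank_rules` is a hypothetical helper, and the word and password lists are made up. It is not how hashcat or any contest tooling scores rules.

```python
from collections import Counter


def apply_rule(rule, word):
    """Toy interpreter for a few hashcat rule functions (assumed subset)."""
    out = word
    i = 0
    while i < len(rule):
        op = rule[i]
        if op == ":":            # no-op
            pass
        elif op == "l":          # lowercase everything
            out = out.lower()
        elif op == "u":          # uppercase everything
            out = out.upper()
        elif op == "c":          # capitalize first letter, lowercase the rest
            out = out[:1].upper() + out[1:].lower()
        elif op == "$":          # append the next character
            i += 1
            out = out + rule[i]
        elif op == "^":          # prepend the next character
            i += 1
            out = rule[i] + out
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
        i += 1
    return out


def rank_rules(rules, words, passwords):
    """Rank rules by how many entries of `passwords` they hit, duplicates included."""
    counts = Counter(passwords)
    hits = {}
    for rule in rules:
        candidates = {apply_rule(rule, w) for w in words}
        hits[rule] = sum(counts[c] for c in candidates)
    return sorted(hits.items(), key=lambda kv: kv[1], reverse=True)


# Made-up data: "password1" shows up five times in the raw dump.
words = ["password", "monkey"]
rules = ["$1", "c"]
dump = ["password1"] * 5 + ["Password", "Monkey"]

weighted = rank_rules(rules, words, dump)       # duplicates kept
uniqd = rank_rules(rules, words, set(dump))     # duplicates removed

print(weighted)
print(uniqd)
```

With duplicates kept, `$1` ranks first because it hits the most accounts. On the uniq'd list, `c` ranks first because it hits more distinct passwords. Which ordering is "right" is exactly what is being argued in this thread.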
#18
There are advantages and disadvantages to both variants.

It would be nice to add more people from the password cracking scene (like team-insidepro and jtr-users) for this contest, as everyone would benefit from it.

@mastercracker & @magnum You guys interested?
#19
I would like to participate but don't really have the time. I will give it a shot if I have some spare time when you run the contest. If you want to make it a bit more challenging, you can make the contest about the best wordlist + rule combination. The winner would be whoever cracks the most passwords using a maximum of X words and Y rules. X could be around 0.5 to 3 million and Y around 50 to 500.
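
A minimal sketch of how such a wordlist + rule submission might be scored, under some loud assumptions: the target plaintexts are known to the organizer, each distinct cracked password counts once, and rules are limited to a toy subset of hashcat rule functions (`:`, `l`, `u`, `c`, `$X`, `^X`). The names `apply_rule` and `score_submission` and the sample data are hypothetical, not part of any existing tool.

```python
def apply_rule(rule, word):
    """Toy interpreter for a few hashcat rule functions (assumed subset)."""
    out = word
    i = 0
    while i < len(rule):
        op = rule[i]
        if op == ":":            # no-op
            pass
        elif op == "l":          # lowercase everything
            out = out.lower()
        elif op == "u":          # uppercase everything
            out = out.upper()
        elif op == "c":          # capitalize first letter, lowercase the rest
            out = out[:1].upper() + out[1:].lower()
        elif op == "$":          # append the next character
            i += 1
            out = out + rule[i]
        elif op == "^":          # prepend the next character
            i += 1
            out = rule[i] + out
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
        i += 1
    return out


def score_submission(words, rules, targets, max_words, max_rules):
    """Count distinct target passwords cracked by a words x rules submission."""
    if len(words) > max_words or len(rules) > max_rules:
        raise ValueError("submission exceeds the contest limits")
    # Assumption: plaintext targets are available, so no hashing is needed here.
    candidates = {apply_rule(r, w) for r in rules for w in words}
    return len(candidates & set(targets))


# Made-up submission and targets; limits taken from the upper ends suggested above.
words = ["password", "monkey"]
rules = [":", "c", "$1", "u"]
targets = ["password1", "Monkey", "dragon99", "PASSWORD"]

score = score_submission(words, rules, targets, max_words=3_000_000, max_rules=500)
print(score)
```

Counting each duplicate in the target list instead of distinct passwords would be a one-line change, and it is the same trade-off discussed earlier in the thread.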
#20
This opens another question: are the plaintext passwords for the hashes known or not?