Would it give me an advantage to take the several dozen wordlists available online, combine them, and remove duplicates, whitespace, and so on? It looks like the resulting dictionary file would be close to 200 GB.
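For reference, a minimal sketch of that merge-and-dedupe step (filenames are hypothetical). It keeps the seen-set in RAM, so it only works while the deduplicated output fits in memory; at the ~200 GB scale you'd reach for an external `sort -u` instead:

```python
import sys

def merge_wordlists(paths, out_path):
    """Merge several wordlists into one, stripping whitespace and
    dropping duplicates. The seen-set lives in RAM, so this only
    scales to outputs that fit in memory."""
    seen = set()
    with open(out_path, "w", encoding="utf-8") as out:
        for path in paths:
            with open(path, encoding="utf-8", errors="replace") as f:
                for line in f:
                    word = line.strip()
                    if word and word not in seen:
                        seen.add(word)
                        out.write(word + "\n")

if __name__ == "__main__":
    # usage: python merge.py out.txt list1.txt list2.txt ...
    merge_wordlists(sys.argv[2:], sys.argv[1])
```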
Giant deduplicated wordlists are great for fast hashes, where you can afford to try everything, but not as helpful for slow ones, where each guess is expensive and you'll only ever get through a small fraction of the list.
And what's usually missing from most of them is *frequency data*.
A giant wordlist, sorted by how common those words are as passwords or as the base of a password, would be the sweet spot.
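A minimal sketch of building such a list, assuming breach corpora with one password per line and repeats left intact (RockYou-style dumps fit this). The counter is held in RAM, so it only scales to corpora whose unique entries fit in memory:

```python
from collections import Counter
import sys

def build_frequency_list(corpus_paths, out_path):
    """Count how often each password appears across one or more
    breach corpora, then write the unique passwords sorted
    most-common-first. Trying candidates in this order yields
    the cheapest hits earliest, which is what matters when each
    guess against a slow hash is expensive."""
    counts = Counter()
    for path in corpus_paths:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line in f:
                word = line.rstrip("\n")
                if word:
                    counts[word] += 1
    with open(out_path, "w", encoding="utf-8") as out:
        # most_common() iterates entries in descending count order
        for word, _count in counts.most_common():
            out.write(word + "\n")

if __name__ == "__main__":
    # usage: python freqsort.py out.txt breach1.txt breach2.txt ...
    build_frequency_list(sys.argv[2:], sys.argv[1])
```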