Dump Scraper - Printable Version

+- hashcat Forum (https://hashcat.net/forum)
+-- Forum: Misc (https://hashcat.net/forum/forum-15.html)
+--- Forum: User Contributions (https://hashcat.net/forum/forum-25.html)
+--- Thread: Dump Scraper (/thread-4196.html)
Dump Scraper - vladimir125 - 03-19-2015

As you already know, the Internet is full of passwords (plain and hashed): when a leak occurs, it's usually posted to PasteBin. The pace of these dumps is so high that it's not humanly possible to collect them all, so we have to rely on a bot that scrapes the PasteBin site for interesting files. Dump Monitor does exactly this: every time leaked information is posted on PasteBin, it tweets the link.

Sadly, Dump Monitor is not very efficient: inside its tweets you will find a lot of "false positives" (debug data, log files, antivirus scan results) or material we're not interested in (RSA private keys, API keys, lists of email addresses). Moreover, once you have the raw data, you still need to extract the useful information and remove all the garbage.

That's the reason why Dump Scraper was born: inside this repository you will find several scripts to fetch the latest tweets from Dump Monitor, analyze them (discarding useless files), and extract the hashes or the passwords.

https://github.com/tampe125/dump-scraper/releases

Please remember to read the wiki before continuing: https://github.com/tampe125/dump-scraper/wiki

Finally, this is a super-alpha release, so things may be broken or not work as expected. Moreover, I know it's kind of "hackish": a single program with a GUI would be 100 times better. Sadly, I'm running out of time and I don't know anything about Python GUI development: if anyone wants to contribute, it would be more than welcome!

Please leave your thoughts and opinions here.

RE: Dump Scraper - atom - 03-19-2015

Many thanks!

RE: Dump Scraper - Si2006 - 03-19-2015

Can't get it to work on Ubuntu. I filled in the Twitter auth keys, renamed settings-dist.json, and installed the dependencies.
PHP 5.5.22-1+deb.sury.org~precise+1 | Python 2.7.3

Code:
php scrape.php
PHP Warning: require_once(vendor/autoload.php): failed to open stream: No such file or directory in /home/xxxxx/dump-scraper/scrape.php on line 8
PHP Fatal error: require_once(): Failed opening required 'vendor/autoload.php' (include_path='.:/usr/share/php:/usr/share/pear') in /home/xxxxx/dump-scraper/scrape.php on line 8

RE: Dump Scraper - vladimir125 - 03-19-2015

Ah crap, I forgot to put that in the wiki! You have to get Composer (https://getcomposer.org/download/) and run:

Code:
php composer.phar install

Sigh, that's the risk of always working on a dev environment... Don't worry, though: if everything goes smoothly, I think I'll release a new Python-only version with a single entry point.

RE: Dump Scraper - Si2006 - 03-19-2015

That did the trick! Thanks

RE: Dump Scraper - Si2006 - 03-19-2015

One more problem: it doesn't seem to create the data folder after processing the tweets with "php scrape.php", and it also displays a PHP notice in the terminal.

Code:
processed 2000 tweets

RE: Dump Scraper - vladimir125 - 03-19-2015

Ignore the notice error: it seems the tweet doesn't have any data (I'll add a check for it). Please manually create the folder data/raw. Tomorrow I'll release a new version addressing these issues...

RE: Dump Scraper - winxp5421 - 03-20-2015

Got everything working up until "python classify.py", running Ubuntu 14.04, Python 2.7, scipy 0.13.3, sklearn 0.15.2.

Error: http://pastebin.com/e2QMSmKs

RE: Dump Scraper - vladimir125 - 03-20-2015

Can you please post the training/features.csv file? I think there are some invalid values inside it. You can upload it to pastebin and put the link here. Thank you very much!
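[Editor's note: the two setup fixes mentioned earlier in this thread (running Composer to create vendor/autoload.php, and manually creating the data/raw folder) can be collected into one shell snippet. This is a sketch based only on the thread; it assumes you are in the dump-scraper repository root and have already downloaded composer.phar.]

```shell
# From the dump-scraper repository root:

# 1. Install the PHP dependencies; this creates vendor/autoload.php,
#    which scrape.php requires on line 8 (skipped if composer.phar is absent).
if [ -f composer.phar ]; then
    php composer.phar install
fi

# 2. Create the data folder scrape.php expects but does not create itself.
mkdir -p data/raw
```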
RE: Dump Scraper - winxp5421 - 03-20-2015

After you mentioned that the training CSV had invalid information, I looked at the wiki again and noticed that the training folder was named "train" instead of the more logical "trash". I had a hunch this was a typo, so I made the adjustment and now everything works fine. Thanks!

http://prntscr.com/6j5lck
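[Editor's note: the extraction step the original post describes — keeping hash-like lines from a raw paste and discarding garbage — can be sketched as follows. This is a hypothetical illustration, not Dump Scraper's actual code; the regex only recognizes bare MD5/SHA1/SHA256 hex digests, and `extract_hashes` is a made-up helper name.]

```python
import re

# Bare hex digests of length 32 (MD5), 40 (SHA1), or 64 (SHA256).
HASH_RE = re.compile(
    r'^[0-9a-fA-F]{32}$|^[0-9a-fA-F]{40}$|^[0-9a-fA-F]{64}$')

def extract_hashes(dump_text):
    """Return the lines of a raw paste that look like bare hex hashes,
    dropping log lines, key material, email:hash combos, etc."""
    return [line for line in (l.strip() for l in dump_text.splitlines())
            if HASH_RE.match(line)]

sample = """\
user@example.com:5f4dcc3b5aa765d61d8327deb882cf99
-----BEGIN RSA PRIVATE KEY-----
5f4dcc3b5aa765d61d8327deb882cf99
d8578edf8458ce06fbc5bb76a58c5ca4
"""
print(extract_hashes(sample))
# → ['5f4dcc3b5aa765d61d8327deb882cf99', 'd8578edf8458ce06fbc5bb76a58c5ca4']
```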