Dump Scraper - Printable Version
+- hashcat Forum (https://hashcat.net/forum)
+-- Forum: Misc (https://hashcat.net/forum/forum-15.html)
+--- Forum: User Contributions (https://hashcat.net/forum/forum-25.html)
+--- Thread: Dump Scraper (/thread-4196.html)

RE: Dump Scraper - vladimir125 - 03-20-2015
ARGH!!!! Copy&paste strikes again.

RE: Dump Scraper - winxp5421 - 03-21-2015
I figured it was something like that.
Is there a way to stop scrape.php after a certain date, similar to organize.php?
Thanks for your hard work Vladimir, this is awesome!

RE: Dump Scraper - kartan - 03-21-2015
I have it up and running, and the first results are pretty good. Way better than expected.
However, actually getting it running is a pain. Here are my notes:

apt-get install php5-curl
curl -sS https://getcomposer.org/installer | php
mv settings-dist.json settings.json (insert twitter api keys)
#### dump-scraper is working now #####
python dumpmon-scraper.py -s 2015-03-19
#######################################
apt-get install python-numpy python-scipy python-matplotlib ipython ipython-notebook python-pandas python-sympy python-nose python-sklearn
#pip install scipy
mkdir train
mkdir data/raw/training
#### manually sort out 20 hash/plain/trash into data/raw/training/[hash/plain/trash]
#### php organize.php --train
#### python classify.py
#### php organize.php -s 2015-03-05 -u 2015-03-20
#### php extract.php -s 2015-03-05 -u 2015-03-20
#### find data/processed/plain/ -name "*.txt" -exec cat {} \; | sort -u > pastebin.dict.txt

RE: Dump Scraper - vladimir125 - 03-21-2015
I know setting it up is a pain in the back. I'm working on creating a single executable file, so you won't have to import all the dependencies. Sadly I'm having some trouble with the Twitter library; I hope I can fix it. If there is a Python developer around, help would be very much appreciated.
FYI I'm going to drop the idea of a GUI: since I want to create a cross-platform application, it would be a lot of trouble for such a little improvement.
Any ideas and suggestions are more than welcome!

RE: Dump Scraper - kartan - 03-21-2015
I don't really care much about a GUI; the current setup is actually fine. However, it needs a lot more polishing, like sanity checks and prerequisite checks.
Also, I am working on something that will streamline the learning process a bit. Why is the training dir in data/raw/training/ and not /data/training/ ?

RE: Dump Scraper - kartan - 03-21-2015
Made trainer.py and committed it to your repo.

RE: Dump Scraper - vladimir125 - 04-02-2015
Hello guys,
I have just compiled the win/linux binaries, can you please try them?
https://github.com/tampe125/dump-scraper/releases/tag/v0.1.0-alpha
Now there is a single entry point, you simply have to type:
dumpscraper [command] [options]
Available commands are:
scrape (twitter scraping)
classify -s [since] -u [until] (calculate the score and organize dumps)
extract -s [since] -u [until] (extract useful info)
The training part has been improved:
training -d/--getdata displays an interactive way to manually classify dumps
training -s/--getscore calculates the score for the training data
There are no backwards incompatibilities, so you can just keep all the previous dumps.
I compiled them on Ubuntu 14.04 and Windows 7 32 bit, let me know if they work. Please be aware that this is my first time compiling Python, so things could simply be broken.

RE: Dump Scraper - dermoeter - 04-02-2015
Windows 7 x64 Ultimate
Working fine so far. The hardest part was the Twitter App.

RE: Dump Scraper - lynx - 06-27-2015
I followed the guide to set up the Twitter App and configure the scraper, but it always throws an error: "Twitter error: Could not authenticate you."

RE: Dump Scraper - coolbry95 - 06-27-2015
Dumpmon, the Twitter account, has been suspended; that may be why.
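
For anyone without a Unix shell (for example when using the Windows binary), the last step of kartan's notes, which merges every extracted plain-text file into one de-duplicated wordlist with find and sort -u, can be sketched in Python. This is only a rough equivalent and not part of Dump Scraper itself: the function name build_wordlist is made up for illustration, and the directory layout (data/processed/plain/ containing *.txt files) is taken from the notes above. Unlike sort -u it always sorts in plain byte order, strips trailing carriage returns, and drops empty lines.

```python
from pathlib import Path


def build_wordlist(plain_dir, out_file):
    """Merge every *.txt file under plain_dir into one sorted, unique wordlist.

    Hypothetical helper, roughly equivalent to:
        find plain_dir -name "*.txt" -exec cat {} \\; | sort -u > out_file
    Returns the number of unique lines written.
    """
    unique_lines = set()
    for txt_file in Path(plain_dir).rglob("*.txt"):
        # Dumps arrive in arbitrary encodings, so work on raw bytes
        # instead of decoding (and possibly failing on) each file.
        with open(txt_file, "rb") as handle:
            for line in handle:
                unique_lines.add(line.rstrip(b"\r\n"))

    # Blank lines are of no use in a dictionary file.
    unique_lines.discard(b"")

    ordered = sorted(unique_lines)  # byte-wise order
    with open(out_file, "wb") as out:
        for line in ordered:
            out.write(line + b"\n")
    return len(ordered)
```

Calling build_wordlist("data/processed/plain", "pastebin.dict.txt") from the scraper directory should produce a file comparable to the one in the notes. Keep the output file outside data/processed/plain/, otherwise a later run would read its own previous output back in.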