Extracting the passwords from a multiple file wordlist (sed & grep).
Hello everybody. Let's say you "hypothetically" encounter a wordlist that is not only spread across many files, but also ("hypothetically") contains a lot of other information about users (let's say it's a "hypothetical" leak). As a good guy, you don't need all that info; actually, you don't want it. It would have been great if the list had already been cleaned up and the passwords extracted.

I decided to learn some grep & sed, so this seemed like a great way to get started with those tools. Here's how you could extract all the passwords and clean up the file.

Extract all the files into a directory. You will have to identify something unique about the lines containing the passwords you want to extract. Let's say each record looks something like:

Code:
username=blabla
pass=extract_me
email=really@bad.com
comment=dont share lists containing user emails!

and this goes on forever. Here we'll extract every line containing "pass=" from every file in the folder "extracted_directory" using grep. Note that you need to be one level 'up' from that directory in bash (or your terminal).

Code:
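# -r: recurse into the folder, -h: leave file names out of the output (no -i, since the sed step is case-sensitive)
grep -rh 'pass=' extracted_directory/ > wordlist_merged.txt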
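
If you want to see what the flags do before running this on the real thing, here is a minimal sketch on throwaway data. The temp directory and the file names a.txt and sub/b.txt are made up for the demo; -r makes grep descend into subfolders and -h keeps the file names out of the output.

```shell
# Build a tiny throwaway directory to try the command on (hypothetical sample data).
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/extracted_directory/sub"
printf 'username=blabla\npass=extract_me\nemail=really@bad.com\n' > "$demo_dir/extracted_directory/a.txt"
printf 'username=other\npass=hunter2\n' > "$demo_dir/extracted_directory/sub/b.txt"

# Run grep from one level above the folder, as described in the post.
(cd "$demo_dir" && grep -rh 'pass=' extracted_directory/ > wordlist_merged.txt)

# Without -h every match would be prefixed with the name of the file it came from.
# grep does not promise any file order, so sort the result for a stable view.
sort "$demo_dir/wordlist_merged.txt"
```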

Isn't grep awesome? :) Plus, you are being legit and not looking at personal stuff. Now we have to remove "pass=" from the beginning of every line in that file. We can use sed:

Code:
sed 's/^[[:space:]]*pass=//' wordlist_merged.txt > wordlist_cleaned.txt
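
One thing worth checking here is whether a password could itself contain the text "pass=". A small sketch with made-up sample data, comparing a global substitution with one anchored to the start of the line (the variable names are just for illustration):

```shell
# Hypothetical sample: the second password itself contains the text "pass=".
sample=$(printf 'pass=extract_me\npass=mypass=123\n')

# Global, unanchored substitution: removes every "pass=", even inside the password.
global=$(printf '%s\n' "$sample" | sed 's/pass=//g')

# Anchored substitution: removes only the prefix at the start of the line.
anchored=$(printf '%s\n' "$sample" | sed 's/^pass=//')

printf 'global:\n%s\nanchored:\n%s\n' "$global" "$anchored"
```

With the global version the second password gets mangled, while the anchored version leaves it intact.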

Let's remove leading and trailing whitespace:

Code:
sed 's/^[[:space:]]*//;s/[[:space:]]*$//' wordlist_cleaned.txt > wordlist_whatever.txt
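
Lists that passed through Windows tools often carry tabs or carriage returns too, which a plain space in the bracket expression won't catch. A quick sketch with a made-up sample line, comparing [ ] with the POSIX class [[:space:]]:

```shell
# Hypothetical sample line: a tab and spaces in front, spaces and a carriage return behind.
raw=$(printf '\t  extract_me  \r')

# [ ] matches spaces only, so the tab and the carriage return block the trim.
spaces_only=$(printf '%s\n' "$raw" | sed 's/^[ ]*//;s/[ ]*$//')

# [[:space:]] also covers tabs and carriage returns.
all_space=$(printf '%s\n' "$raw" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')

# Brackets make any leftover whitespace visible.
printf '[%s]\n' "$all_space"
```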

Now you can remove duplicates and sort the file starting with the most used password:

Code:
sort wordlist_whatever.txt | uniq -c | sort -nr > wordlist_sorted.txt
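
To see why the first sort is there: uniq only merges lines that sit next to each other, so the list has to be sorted before counting. A tiny sketch with made-up passwords:

```shell
# Hypothetical sample: "123456" appears three times, "hunter2" twice, "letmein" once.
counted=$(printf '%s\n' hunter2 123456 letmein 123456 hunter2 123456 |
    sort | uniq -c | sort -nr)

# Each line is now "<count> <password>", highest count first.
printf '%s\n' "$counted"
```

The amount of padding in front of each count varies between uniq implementations.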

This results in a list with a count (how many times it occurred) in front of every password. We want to remove those counts using sed:

Code:
sed 's/^ *[0-9]* //' wordlist_sorted.txt > wordlist_FINAL.txt
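
If you're wondering whether this also eats passwords that start with digits: it shouldn't, since the expression stops after the first space following the count. A sketch on made-up uniq -c style output:

```shell
# Hypothetical uniq -c style output; the second password itself starts with digits.
counted=$(printf '      3 letmein\n      2 123 abc\n')

# Only the leading padding, the count, and one space are removed.
stripped=$(printf '%s\n' "$counted" | sed 's/^ *[0-9]* //')

printf '%s\n' "$stripped"
```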

And there you go! A nicely sorted and cleaned list :)
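
For what it's worth, the steps can also be chained into a single pipeline so no intermediate files are left lying around. This is only a sketch: extract_passwords is a made-up helper name, and it assumes every password line starts with a lowercase "pass=" (optionally indented).

```shell
# extract_passwords is a hypothetical helper; $1 is the directory to scan.
extract_passwords() {
    # 1. collect the password lines from every file under the directory
    # 2. drop the "pass=" prefix plus surrounding whitespace
    # 3. count duplicates and put the most used password first
    # 4. strip the counts again
    grep -rh '^[[:space:]]*pass=' "$1" |
        sed 's/^[[:space:]]*pass=[[:space:]]*//;s/[[:space:]]*$//' |
        sort | uniq -c | sort -nr |
        sed 's/^ *[0-9]* //'
}

# Usage (from one level above the folder):
#   extract_passwords extracted_directory/ > wordlist_FINAL.txt
```

Keeping the separate files from the step-by-step version is still handy if you want to inspect what each stage did.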

Alert: Always back up your lists before doing any of this!
Have a great day.

Messages In This Thread
Extracting the passwords from a multiple file wordlist (sed & grep). - by Socapex - 06-14-2012, 05:45 PM