Facebook's directory
#1
Before you continue reading, grab the torrent and start downloading.
Magnet link:
Code:
magnet:?xt=urn:btih:715e5820f10df054cff37f791bfbd4a29504598d&dn=fbdir&tr=udp://open.demonii.com:1337&tr=udp://tracker.coppersurfer.tk:6969&tr=udp://tracker.leechers-paradise.org:6969&tr=udp://exodus.desync.com:6969

Facebook keeps a public directory [1] of every account which hasn't opted out (somewhere in the settings).
I don't own a Facebook account so I can't tell you where to change this setting.
If that sounds familiar, it might be because Ron Bowes crawled this directory a couple of years ago [2].
However, it has gotten a lot bigger since then, so I decided to get a copy of it myself.
It took about two weeks and a lot of bandwidth, so I don't recommend doing it yourself.
That's why I'm sharing it here in the first place.

You'll find two compressed archives in the torrent.
raw.tar.xz contains all the data I've crawled.
Here's Mark Zuckerberg's entry (yes, he's in there) to give you an idea of the format.
Code:
zuck;Mark Zuckerberg
"zuck" is the Facebook alias, username or whatever they call it.
"Mark Zuckerberg" is of course the name.
If no alias is set then the first column is the ID of the account.
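
To make the format concrete, here is a minimal sketch of how one record could be split and how an alias could be told apart from an account ID. The helper name classify_record and the second sample record are made up for illustration, and the sketch assumes an ID is always digits only (the same assumption the awk filter further down relies on).

```shell
# Sketch: split one raw record into key and name, and tell aliases from IDs.
# classify_record is a hypothetical helper name, not something shipped with the data.
classify_record() {
    key=${1%%;*}    # text before the first ';' (alias or account ID)
    name=${1#*;}    # text after the first ';' (display name)
    case $key in
        ''|*[!0-9]*) kind=alias ;;   # anything that is not purely digits
        *)           kind=id ;;      # digits only: assumed to mean no alias was set
    esac
    printf '%s\t%s\t%s\n' "$kind" "$key" "$name"
}

classify_record 'zuck;Mark Zuckerberg'
classify_record '1000001;Jane Doe'    # made-up ID and name, for illustration only
```

Splitting on the first ";" only means a name that itself contains a semicolon stays intact.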

processed.tar.xz
First I took the names from the names and names non-latin directories that contained exactly one space.
Then I split them into first and last names, counted the occurrences and finally sorted them.
The result looks like this:
Code:
2624503 david
2381129 maria
2258886 john
2174816 daniel
1980513 michael
1825282 alex
1692624 ali
1685169 ahmed
1663437 carlos
1663132 mohamed
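
The steps above can be sketched roughly as a pipeline. This is a reconstruction under assumptions, not necessarily the exact commands that were used: count_first_names is a hypothetical helper, the sample records are made up, and lowercasing is done for ASCII letters only.

```shell
# Sketch: count first names from "key;Full Name" records read on stdin.
# count_first_names is a hypothetical name; the body runs in a subshell
# so the locale change does not leak into the calling shell.
count_first_names() (
    export LC_ALL=C                  # byte-wise, reproducible matching and sorting
    cut -d';' -f2- |                 # drop the alias/ID column, keep the name
    grep -E '^[^ ]+ [^ ]+$' |        # keep names with exactly one space
    cut -d' ' -f1 |                  # first name only
    tr 'A-Z' 'a-z' |                 # lowercase ASCII letters
    sort | uniq -c |                 # count occurrences
    sort -k1,1nr -k2,2               # most frequent first, ties by name
)

# Made-up sample records, for illustration only.
printf '%s\n' 'a1;David Smith' 'a2;david Jones' 'a3;Maria Garcia' \
              'a4;Cher' 'a5;Ana Maria Lopez' | count_first_names
```

For last names, the same pipeline should work with `-f2` instead of `-f1` in the second cut. On the real data it would be fed with something like `cat names/fbdata.* | count_first_names`.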

Why the heck did I post it here?
It's common that users choose their passwords with a name as base and add numbers, etc.
So get the names, add some rules and have fun.

[1] https://www.facebook.com/directory
[2] https://blog.skullsecurity.org/2010/retu...-snatchers


Attached Files
.zip   fbdir.zip (Size: 83.57 KB / Downloads: 203)
#2
VERY nice thing. I also wanted to do something like that, but as you say it's a lot of bandwidth.
Thank you very much.
#3
Interesting, thanks for this. Should be good for making wordlists as you pointed out.
#4
Thanks, I'll download it and check it out later. I'm excited.
#5
Bout time this was re-done.

Well done that man.
#6
Thanks for the feedback and I'm glad you like it!
ati6990: you're welcome ;)

All I did was make a lot of HTTP GET requests to a public part of facebook.com, parse the results and store them in files.
I want to make that clear before anyone (not from this community) accuses me of "hacking" Facebook.
I'll write more about it in a later post.

Take a look at the picture in the attachment and guess when atom tweeted the link to this post. (that's megaBYTES per second)
Whoever caused the spike: nicely done and I hope you'll seed the torrent as long as you can!

Back to the directory...
The crawling was done during the first two weeks in December 2014.
I've also got the data from the pages and places, but they are only included in the raw.tar.xz archive since I didn't process them further.
The latin names were converted to lowercase before sorting.
I didn't do that for the nonlatin names because I wasn't sure if it would break the UTF-8 characters.
If anyone knows more about it, please post it here.

Here are some commands to deal with the dataset.
For Windows users there is Cygwin, but I recommend taking a look at a Unix-based OS.

Get rid of the count in the processed files (so you can use it as a dictionary)
Code:
$ cut -b9- first_names.txt > first_names.dic
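
The cut command above relies on the count column being exactly eight bytes wide (a count padded to seven characters plus one space), which is an assumption about how the files were written. If that doesn't hold for a file, a pattern-based variant is a possible alternative. A minimal sketch, with strip_counts as a hypothetical helper name:

```shell
# Sketch: remove optional leading spaces, the count and the single space
# after it, whatever the width of the count column is.
strip_counts() {
    sed 's/^ *[0-9][0-9]* //'
}

# Sample lines: one taken from the list above, one padded with a made-up count.
printf '%s\n' '2624503 david' '     12 maria' | strip_counts
```

Usage would be the same as before, for example `strip_counts < first_names.txt > first_names.dic`.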

Get a list of usernames and exclude the ones which are IDs
Code:
$ cut -d";" -f1 names/fbdata.* | awk '! /^[0-9]+$/' > usernames.txt
Repeat that for the non-latin names, pages and places. The output should already be unique.


Attached Files
.png   traffic.png (Size: 11.31 KB / Downloads: 69)
#7
Looks like you tried to DDoS FB ;-) Great work btw.
#8
Done downloading, I'll seed for a wee while.
#9
Blandy put together some Unix tools.

http://home.btconnect.com/md5decrypter/unix-utils.zip
#10
processed.tar.xz looks like it was made to be paired with T0XlC-insert_00-99_1950-2050_toprules_0_F.rule.