Posts: 4
Threads: 1
Joined: May 2013
Hi!
Is there any option to save the calculated hashes from a dictionary to another file?
Can I just hash every line from the dict and save it?
I'm building a small database of plain:md5:sha1, and hashing it with PHP would take a few years.
Posts: 179
Threads: 13
Joined: Dec 2012
05-19-2013, 01:34 PM
(This post was last modified: 05-19-2013, 02:32 PM by Kuci.)
You can use this bash script:
Code:
while IFS= read -r line; do echo -n "$line" | md5sum | cut -c -32; done < [yourfile]
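If you also want the plaintext next to each hash (for the plain:md5 list you mentioned), a minimal variation of the same loop would look something like this; wordlist.txt and plain_md5.txt are just placeholder names:
Code:
#!/bin/bash
# Emit "plaintext:md5" pairs, one per line.
# printf is used instead of echo -n so words like "-n" or "-e" are not mangled.
while IFS= read -r line; do
    printf '%s:%s\n' "$line" "$(printf '%s' "$line" | md5sum | cut -c -32)"
done < wordlist.txt > plain_md5.txt
This forks md5sum once per word, so it is still far too slow for a huge list, which is exactly the problem the rest of this thread is about.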
Posts: 4
Threads: 1
Joined: May 2013
Yeah, thanks for the reply.
I can do it with bash or PHP, but I want to accelerate that process with a GPU.
Posts: 2,301
Threads: 11
Joined: Jul 2010
you'd have to code that yourself.
Posts: 179
Threads: 13
Joined: Dec 2012
Well, as undeath wrote, you can code it yourself. Or try to contact atom and tell him about your idea.
For now, bash should be faster than PHP.
Posts: 15
Threads: 0
Joined: Sep 2012
You need to write a program that hashes with SSE2/SSSE3/AVX to make it fast, and you should write truncated binary hashes to the file. Since I guess you want to make it useful, you should build a lossy hash table: index on roughly log2(number of passwords) bits of the hash and store a binary value that represents a range of passwords. Unless you are using an unmodified dictionary, you'll need to store the full password instead. With an unmodified dictionary you could split it into blocks and compress each block; then the "binary value that represents a range of passwords" is simply which compressed dictionary block to look in.
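To make the bin-indexing part a bit more concrete, here is a toy sketch of my own (not the tool behind the site below): it treats the top 24 bits of each word's MD5 as the bin number, i.e. about 16 million bins; the bin count and wordlist.txt are made-up values for the example.
Code:
#!/bin/bash
# Toy illustration of the bin index for a lossy hash table:
# use the top 24 bits of the MD5 as the bin number (2^24 bins,
# an arbitrary choice for this example), then print "bin word".
while IFS= read -r word; do
    h=$(printf '%s' "$word" | md5sum | cut -c -32)
    printf '%s %s\n' $(( 16#${h:0:6} )) "$word"   # first 6 hex digits = 24 bits
done < wordlist.txt
A real implementation would store, per bin, which compressed dictionary block to look in (as described above) rather than printing the pairs.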
Anyway it's not like I've done similar things...
http://www.tobtu.com/md5.php. It took about 3 days to generate a database with 50 billion passwords and this was kinda slow. So far this is the best I've come up with.
You can look at this:
http://en.wikipedia.org/wiki/Wikipedia_t...ano_coding
I really need to go find those papers and read them so I can add sources and get the article accepted.
I need:
"On binary representations of monotone sequences" by Peter Elias in 1972
I have (but haven't fully read yet; the intro was so long and boring that I only figured out that, in the case of lossy hash tables, Huffman encoding of the number of passwords per bin (with the number of bins on the order of the number of passwords) is more efficient, but that leads to an unknown output size instead of the size you know up front with Elias-Fano):
"Efficient Storage and Retrieval by Content and Address of Static Files" by Peter Elias in 1974
I found (but this doesn't describe anything about the Elias-Fano coding so you "don't" need to read it):
Robert M. Fano. On the number of bits required to implement an associative memory. Memorandum 61, Computer Structures Group, Project MAC, MIT, Cambridge, Mass., 1971.
Posts: 4
Threads: 1
Joined: May 2013
Thanks, that is really useful.
I have some work to do...
Posts: 30
Threads: 7
Joined: Dec 2012
A couple of other minor points, and a question: Why are you storing this in a database?
The minor points: when dealing with large amounts of data, it's important to remember that disk I/O isn't free :-)
On my system, the stock MD5() function in libcrypto runs at 5 million hashes per second on a single thread. SHA1 is a bit slower, at 2.5 million hashes per second. Both numbers assume a typical 8-character password (longer passwords take longer, of course).
That same system can read a file at about 11 million words per second (again, assuming 8-character passwords plus a linefeed). It can write a little bit slower.
So, if you just want to create a list of
word:MD5:SHA1
you will need to read ~9 characters and write ~9+33+41 ≈ 83 characters for each line.
That means you will be able to write about 1 million lines per second before you run out of disk bandwidth.
In other words, the stock MD5 and SHA1 functions are plenty fast enough to run your disk to saturation, if you have standard hard drives. If you have an SSD, you might need to use one or two threads to bring your speed up.
Using a GPU won't help (at all) in this application.
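If you want rough equivalents of those numbers on your own machine, the commands below give a ballpark (they assume OpenSSL's command-line tool and GNU coreutils; testfile is just a placeholder name). Note that openssl speed hashes fixed-size buffers rather than individual words, so it overstates per-password throughput, but the order of magnitude is what matters here.
Code:
# Single-threaded MD5 and SHA1 throughput of libcrypto.
openssl speed md5 sha1

# Rough sequential write speed: 1 GB of zeros, flushed to disk before dd reports a rate.
dd if=/dev/zero of=testfile bs=1M count=1024 conv=fdatasync
rm -f testfile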
Posts: 4
Threads: 1
Joined: May 2013
Thanks.
It will hash slower than I thought, but that isn't a problem.
I can hash 170 million words to MD5 and save them to a CSV in just an hour.