oclHashcat v1.02 going for distributed cracking
#1
There have been a lot of different third-party approaches to distributed cracking with oclHashcat. The basic idea is easy: as in all parallel computing environments, you need to find a way to split the work across your set of worker nodes.

At this time, the following ideas have been developed:

- Split the dictionary into N pieces, distribute the pieces to worker nodes
- Split the rules into N pieces, distribute the pieces to worker nodes
- Split the mask into N pieces, distribute the pieces to worker nodes
- Create offsets in .restore files and distribute the restore files to worker nodes

They all work, but they are all more or less suboptimal, simply because oclHashcat was lacking a specific feature that users need to make distribution easier, faster and overall better.

What we added are just two parameters: -s and -l. They are all you need to integrate oclHashcat into your favourite distribution system, like BOINC or one of your own. And it's simple. If you are familiar with the hashcat tools, you know these parameters already, since hashcat CPU, maskprocessor and statsprocessor have them, too.

The parameters stand for (s)kip and (l)imit. They are just two integers that define a range of any size inside your keyspace: with -s you set the offset, and with -l you set the length of the range.

Let's do an easy math example. You have a dictionary of 1000 words and 4 worker nodes, all running at the same speed.

You simply divide 1000 by 4 and get 250. This is the length, and it's the same value for all nodes because they all have the same speed. Your command lines will look as follows:

Code:
PC1: ./oclHashcat64.bin -s   0 -l 250 ... // computes   0 - 249
PC2: ./oclHashcat64.bin -s 250 -l 250 ... // computes 250 - 499
PC3: ./oclHashcat64.bin -s 500 -l 250 ... // computes 500 - 749
PC4: ./oclHashcat64.bin -s 750 -l 250 ... // computes 750 - 999

But where did you get the 1000 from? Well, it's just the number of words in your dictionary. Calculating this gets more complicated with more advanced attack modes, though, so we added another parameter: --keyspace. It tells you the keyspace you need to know. For brute-force, for instance, the keyspace is generated dynamically depending on the mask and the algorithm used, which is why you should use --keyspace instead of trying to calculate it yourself.

Code:
root@sf:~/oclHashcat-1.02# ./oclHashcat64.bin Verified.list.txt -m 2611 -a 3 ?d?d?d?d?d?d?d?d?d --keyspace
1000000
root@sf:~/oclHashcat-1.02# ./oclHashcat64.bin Verified.list.txt -m 2611 -a 3 ?d?d?d?d?d?d?d?d --keyspace  
100000
root@sf:~/oclHashcat-1.02# ./oclHashcat64.bin Verified.list.txt -m 2611 -a 3 ?d?d?d?d?d?d?d --keyspace  
10000
root@sf:~/oclHashcat-1.02# ./oclHashcat64.bin Verified.list.txt -m 2611 -a 3 ?d?d?d?d?d?d --keyspace  
10000

If you take a close look at the last command line, you will see the special behavior: oclHashcat automatically reorganizes masks to make them as performant as possible, so the reported keyspace is not simply what you would calculate by hand. Make sure to run --keyspace for each attack you do, regardless of the attack mode.
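
For illustration, here is a minimal wrapper sketch that queries --keyspace and derives -s/-l for a fixed number of equally fast nodes. It assumes --keyspace prints just the number, as in the output above; the hash list, mode and mask are taken from that example, and the node count is only a placeholder. Splitting evenly like this only makes sense when all nodes really run at the same speed:

Code:
#!/bin/bash
# ask oclHashcat for the real keyspace instead of trying to calculate it yourself
keyspace=$(./oclHashcat64.bin Verified.list.txt -m 2611 -a 3 '?d?d?d?d?d?d?d?d?d' --keyspace)

nodes=4
chunk=$(( keyspace / nodes ))

for (( i = 0; i < nodes; i++ ))
do
  skip=$(( i * chunk ))
  limit=$chunk
  # let the last node pick up the remainder if the keyspace does not divide evenly
  (( i == nodes - 1 )) && limit=$(( keyspace - skip ))
  echo "PC$(( i + 1 )): ./oclHashcat64.bin Verified.list.txt -m 2611 -a 3 ?d?d?d?d?d?d?d?d?d -s $skip -l $limit"
done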

Sometimes you have a more complicated environment. For example, the nodes may not all run at the same speed, or, even more complicated, nodes switch off and on at random times, or you want to be able to add new nodes or take out nodes while the attack is running. This is usually a more production-like system. In that case you should take a different approach. Again, you know your total keyspace is 1000, but you don't divide by 4, because the number of nodes changes all the time. To keep it simple, you can assign each node a fixed workload of, let's say, -l 100. Your master node then just tracks the -s value.

Code:
keyspace=1000
limit=100

# the master hands the next free node (PCxxxx) one chunk and advances skip
for (( skip = 0; skip < keyspace; skip += limit ))
do
  PCxxxx: ./oclHashcat64.bin -s $skip -l $limit ...
done
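
Expanding that idea into something closer to runnable, a rough master-loop sketch could look like the following. The host names, hashes.txt and wordlist.txt are placeholders, every node is assumed to already have oclHashcat and both files in place, and a real master would also track which chunks actually finished so it can reassign the work of a node that disappears:

Code:
#!/bin/bash
hosts=(pc1 pc2 pc3 pc4)   # placeholder worker nodes, may grow or shrink over time
keyspace=1000
limit=100

skip=0
while (( skip < keyspace ))
do
  for host in "${hosts[@]}"
  do
    (( skip >= keyspace )) && break
    # clamp the last chunk so it does not run past the end of the keyspace
    l=$limit
    (( skip + limit > keyspace )) && l=$(( keyspace - skip ))
    ssh "$host" "cd oclHashcat-1.02 && ./oclHashcat64.bin hashes.txt wordlist.txt -s $skip -l $l" &
    skip=$(( skip + limit ))
  done
  wait   # crude: wait for this round of chunks to finish before handing out the next one
done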

What we were trying to explain is that those two parameters are all you need to enable distributed computing, even in more complicated environments.

A beta version of oclHashcat v1.02 is available in /beta for beta-testers. Have fun playing. For all other users, just wait for the v1.02 release. This feature will be included, guaranteed!

--
atom
#2
(01-19-2014, 12:37 AM)atom Wrote: ...
A beta version of oclHashcat v1.02 is available in /beta for beta-testers. Have fun playing. For all other users, just wait for the v1.02 release. This feature will be included, guaranteed!

--
atom

Very cool atom! *VERY COOL* :-D
#3
Thx atom.
This is an amazing feature.

I want to add one technical detail:
whenever you use -s / -l in combination with maskfiles / directories, the offsets / values apply to the first mask / file only. It currently won't skip an entire file and "overflow" into the next one if -s is larger than the first mask / file.
I think this is good this way, to avoid confusion and unwanted behaviour.
Just wanted to add this as a note, so that everyone knows / can understand / argue about it.
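
For example, if you want to distribute the masks of a maskfile, one way (just a rough sketch; masks.txt and hashes.txt are made-up names) is to query and split each mask on its own:

Code:
# masks.txt and hashes.txt are placeholders
while read -r mask
do
  ks=$(./oclHashcat64.bin hashes.txt -a 3 "$mask" --keyspace)
  echo "mask $mask -> keyspace $ks, now split this mask with -s / -l as shown above"
done < masks.txt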

thx again ... great job!!!
#4
Specifics on the underlying protocol and authentication mechanisms, please.

There appear to be two assumptions behind the -s/-l mechanics:

1) The password lists are relatively small; and,
2) Data flow (i.e. network usage) is relatively minimal.

Without documentation on the mechanics of -s and -l, I have to ask: how is skipping achieved? Assuming the words aren't sorted and indexed, I assume skipping is done by reading through the list, because you don't know where to seek(). On a 1TB file, for example, that could mean most of the file has to be read before work is submitted to the last GPU, resulting in two problems:

1) Ramp-up is a slow linear step (i.e., node start up, network xfr contention, etc.); and,
2) GPU set #1 may have long finished before GPU set #7 is even started.

It's not clear to me from the posting whether a central node distributes chunks of work or you run oclHashcat manually on each node. The latter is not good, because a 100MB list across ten equal GPU sets means a rough network bandwidth consumption of 500MB.

My use cases are probably different from others'. I have precomputed password lists, whereas others are probably running rules against small lists; in my case that is >50TB of data.
#5
(01-19-2014, 02:46 AM)dglatting Wrote: Specifics on the underlying protocol and authentication mechanisms, please.

There is no protocol and no authentication mechanism. This is not a solution for distributed cracking by itself; it just adds a few features to hashcat which make it easier to integrate into distributed frameworks.

(01-19-2014, 02:46 AM)dglatting Wrote: My use cases are probably different from others'. I have precomputed password lists, whereas others are probably running rules against small lists; in my case that is >50TB of data.

Unless you are working with very slow algorithms, you cannot achieve full acceleration with wordlists alone, regardless of their size. Rules are necessary when cracking fast hashes if you have any hope of achieving full acceleration; otherwise you might as well just be using a CPU.