hashcat
advanced password recovery

***atom*** · (This post was last modified: 09-07-2012, 05:05 PM by atom.)

We are proud to present oclHashcat-plus v0.09!

Download it here: http://hashcat.net/oclhashcat-plus/

Lots of new features and algorithms have been added, and many bugs have been fixed.

The major changes are:

Support for cracking the bcrypt and sha512crypt ($6$) algorithms.
Support for GPU clustering across multiple LAN hosts via VCL, and an increase to support 128 GPUs.
Added what we call a Brute-Force++ attack (see details for description).
Increased cracking performance, especially on multi-hash, thanks to partial hash reversal as you know it from single-hash cracking.

Let's start with the algorithms added; in this case, the generic types:

added -m 10 = md5(pass.salt)
added -m 20 = md5(salt.pass)
added -m 30 = md5(unicode(pass).salt)
added -m 40 = md5(salt.unicode(pass))
added -m 110 = sha1(pass.salt)
added -m 120 = sha1(salt.pass)
added -m 130 = sha1(unicode(pass).salt)
added -m 140 = sha1(salt.unicode(pass))
added -m 1410 = sha256(pass.salt)
added -m 1420 = sha256(salt.pass)
added -m 1710 = sha512(pass.salt)
added -m 1720 = sha512(salt.pass)

They have been added for two reasons.

1. Many users requested them, for example here:

http://hashcat.net/forum/thread-1009.html
http://hashcat.net/forum/thread-1152.html
http://hashcat.net/forum/thread-1444.html
http://hashcat.net/forum/thread-474.html
http://hashcat.net/forum/thread-490.html
http://hashcat.net/forum/thread-574.html
http://hashcat.net/forum/thread-577.html
http://hashcat.net/forum/thread-651.html
http://hashcat.net/forum/thread-830.html
http://hashcat.net/forum/thread-833.html
http://hashcat.net/forum/thread-944.html
http://hashcat.net/forum/thread-951.html

2. By adding another feature -- that is, setting the minimum length for a salt to 0 -- you can construct your own hashing modes if you exploit the salt by putting some data into the calculation. Since we have support in oclHashcat-plus for --hex-salt, this will make your lives even easier.
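To make the generic modes a bit more concrete, here is a minimal sketch in Python of what the MD5 variants compute. It assumes that "unicode(pass)" means the password encoded as UTF-16LE and that a --hex-salt value is simply the salt bytes written in hex. The function names are made up for this illustration and are not part of oclHashcat-plus.

```python
import hashlib

def md5_pass_salt(password: bytes, salt: bytes = b"") -> str:
    """-m 10 style: md5(pass.salt), '.' meaning concatenation."""
    return hashlib.md5(password + salt).hexdigest()

def md5_salt_pass(password: bytes, salt: bytes = b"") -> str:
    """-m 20 style: md5(salt.pass)."""
    return hashlib.md5(salt + password).hexdigest()

def md5_unicode_pass_salt(password: str, salt: bytes = b"") -> str:
    """-m 30 style: md5(unicode(pass).salt), assuming unicode == UTF-16LE."""
    return hashlib.md5(password.encode("utf-16-le") + salt).hexdigest()

# With a zero-length salt the salted mode collapses to the raw hash.
raw = md5_pass_salt(b"hashcat")

# "Exploiting" the salt: a fixed binary suffix given in hex (hypothetical value)
# turns -m 10 into md5(pass . "\x00\xff") without needing a dedicated mode.
custom = md5_pass_salt(b"hashcat", bytes.fromhex("00ff"))
```

The same idea carries over to the sha1, sha256 and sha512 variants by swapping the hash function.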

Next one is the bcrypt algorithm.

Guys, there is not much to say. Just one thing: do not expect too much! This algorithm was designed to run extremely slow on GPUs. It is highly dependent on memory lookups, and is both salted and iterated. On our hd6990, we can reach 4085/s. This isn't much, but it's still multiple times faster than on CPU.

Details here:

http://hashcat.net/forum/thread-1219.html
http://hashcat.net/forum/thread-302.html
http://hashcat.net/forum/thread-186.html

Another algorithm we added was the EPIserver algorithm. These are the hashes stored by the ASP.NET membership provider. For more detailed information about this, have a look here: http://hashcat.net/forum/thread-987.html

There are plans to rename this algorithm from EPIserver to something like "asp.net membership provider." For now we will stick to EPIserver, but we will certainly rename this in a later version.

There was already an interesting blog post about all this here, definitely a good read: http://www.troyhunt.com/2012/06/our-pass...othes.html

Last but not least, the most impressive addition is the sha512crypt algorithm, aka $6$, which is used in nearly all Linux distributions by default.

Like all crypt(3) algorithms, this is another algorithm which is designed to run slow; plus, it is based on sha512, which uses 64-bit integers. Today's AMD GPUs do not natively support 64-bit bitwise arithmetic (except shifts), so this is another reason why this algorithm is slow.

Still, the speedup cracking sha512crypt on GPU versus CPU is much higher compared to bcrypt. My hd6990 gives an impressive 32519/s, which we are very proud of!
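To put the two hd6990 numbers from this post into perspective, here is a small back-of-the-envelope calculation in Python. The keyspace (6 lowercase letters) is only an example picked for this sketch, and it assumes a single hash and a constant rate; real rates depend on the iteration count of each hash, which is not stated here.

```python
def seconds_to_exhaust(charset_size: int, length: int, rate_per_second: float) -> float:
    """Worst-case time to try every candidate of the given length."""
    return charset_size ** length / rate_per_second

# Rates quoted in this post for a single hd6990.
RATES = {"bcrypt": 4085, "sha512crypt": 32519}

for name, rate in RATES.items():
    hours = seconds_to_exhaust(26, 6, rate) / 3600
    print(f"{name}: about {hours:.1f} hours for 6 lowercase letters")
```

Each extra character multiplies the time by the charset size, which is why slow hashes get out of reach for pure brute force so quickly.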

This algorithm was requested here:

http://hashcat.net/forum/thread-790.html
http://hashcat.net/forum/thread-736.html
http://hashcat.net/forum/thread-303.html

The partial reversing of hashes for multi-hash lists differs a bit from classic single-hash reversal, which you are already familiar with if you use oclHashcat-lite. For several reasons, it is not efficient to reverse all hashes as many steps back as in single-hash cracking, and thus we cannot reach oclHashcat-lite speed. But it can still be more efficient than just traditional early checks.

To visualize this, we made some graphs:

[Image: plus89_mh18.png]

You can see that the fewer hashes you have, the more efficient it is. The curves on Nvidia are a bit sharper.

Whenever you run brute force on multiple MD4, NTLM or MD5 hashes, oclHashcat-plus will use this partial reversal technique. In theory we can port this to salted hashes as well, but multi-hash on a salted hash is a bad idea. So for now, we stick to raw and reversible algorithms.
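For readers who have not seen hash reversal before, here is a heavily simplified sketch in Python. It undoes only the very last operation of MD5 (the addition of the initial state) once per target hash, so that each candidate can be compared without performing that addition. This is not the oclHashcat-plus kernel, which reverses more steps; the names md5_rounds, reverse_target and crack are invented for this example.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)
SHIFTS = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
K = [int(abs(math.sin(i + 1)) * 2**32) & MASK for i in range(64)]

def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK

def md5_rounds(password: bytes):
    """Run the 64 MD5 steps on one padded block, skipping the final '+ IV'."""
    assert len(password) < 56, "sketch handles single-block inputs only"
    block = password + b"\x80" + b"\x00" * (55 - len(password))
    block += struct.pack("<Q", 8 * len(password))
    m = struct.unpack("<16I", block)
    a, b, c, d = IV
    for i in range(64):
        if i < 16:
            f, g = (b & c) | (~b & d), i
        elif i < 32:
            f, g = (d & b) | (~d & c), (5 * i + 1) % 16
        elif i < 48:
            f, g = b ^ c ^ d, (3 * i + 5) % 16
        else:
            f, g = c ^ (b | ~d), (7 * i) % 16
        f = (f + a + K[i] + m[g]) & MASK
        a, d, c, b = d, c, b, (b + rotl(f, SHIFTS[i])) & MASK
    return (a, b, c, d)

def reverse_target(hex_digest: str):
    """Undo the final '+ IV' of MD5 once, up front, for a target hash."""
    words = struct.unpack("<4I", bytes.fromhex(hex_digest))
    return tuple((w - iv) & MASK for w, iv in zip(words, IV))

def crack(hex_digests, candidates):
    """Compare each candidate's pre-final state against the reversed targets."""
    targets = {reverse_target(h): h for h in hex_digests}
    found = {}
    for pw in candidates:
        state = md5_rounds(pw)
        if state in targets:
            found[targets[state]] = pw
    return found
```

The saving per candidate is tiny in this toy version; the point is only that work done once per target hash can replace work done once per candidate.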

Another nice thing that came up lately is the Virtual OpenCL Cluster Platform (VCL) project. When thorsheim and epixoip informed us about this project in this post http://hashcat.net/forum/thread-1473.html it was totally not working with oclHashcat-*, nor any other OpenCL-based password cracker. But, we got in contact with the developers at MOSIX, and after some debugging and trace sessions, we were able to pinpoint the problems. MOSIX then released VCL version 1.15 which addressed these issues.

The overhead produced by the network agents is very low. This is one of the most important factors for a distributed solution. I made some stats on this here:

[Image: vcltable.png]

VCL is intended to be used on dedicated LANs or with High Speed Interconnects. I would not recommend clustering nodes over the Internet, as both latency and bandwidth would be an issue.

Development for VCL support is still in its infancy, but I've tested it with 22 GPUs and it worked well. Installing and configuring VCL is outside the scope of these release notes, but I plan to write a forum post on this topic soon. However, there is no magic required to get VCL running on your own.

To better support VCL, we have increased the maximum number of GPUs from 16 to 128. We do not know for a fact if VCL can handle 128 GPUs, but it works with at least 22 GPUs.

Another nice thing about this is that it works around the 8-GPU limitation in AMD's drivers and Xorg. Since VCL does not require X to run, you can build giant GPU clusters this way.

Something that was already included in the newer versions of oclHashcat-lite is the support for markov-chains.

It does not matter if you do a simple Brute-Force attack using -a 3 or a dictionary-based Hybrid-Attack using either -a 6 or -a 7. This enhancement is automatically used EVERY time you use a mask.

A little background on this, as if you do not use oclHashcat-lite you might not know:

The markov-attack is a statistically based brute-force like attack, but instead of specifying a charset or a mask, we specify a file that was generated once in a previous step. It contains statistical information which is made out of an automated analysis of a given dictionary.

It can fully replace Brute-Force since it covers the full keyspace.

In a Brute-Force Attack (or a Mask Attack) we can limit the keyspace by setting a smaller charset in order to reduce the attack time. The Markov Attack has something similar: the "threshold". All you do is specify a number. The higher the number, the higher the threshold for adding a new link between two characters in the two-level table on which the markov-attack is based.

The background is not so important -- just remember that the lower the value, the smaller the keyspace, and thus the faster the attack is.
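If you want to play with the idea, here is a toy sketch in Python of per-position markov statistics with a threshold. It is an assumption-laden simplification: it does not use the real stats file format, it only follows links seen in the wordlist (so unlike the real attack it does not cover the full keyspace), and it treats the threshold as "keep at most this many follow-up characters per link". The names train and candidates are made up for this example.

```python
from collections import Counter, defaultdict

def train(words, max_len):
    """Count first characters and, per position, which character follows which."""
    first = Counter()
    links = [defaultdict(Counter) for _ in range(max_len)]
    for word in words:
        word = word[:max_len]
        if not word:
            continue
        first[word[0]] += 1
        for pos in range(1, len(word)):
            links[pos][word[pos - 1]][word[pos]] += 1
    return first, links

def candidates(first, links, length, threshold):
    """Yield candidates, most likely links first, at most `threshold` per link."""
    def extend(prefix):
        if len(prefix) == length:
            yield prefix
            return
        pos = len(prefix)
        table = first if pos == 0 else links[pos][prefix[-1]]
        for ch, _count in table.most_common(threshold):
            yield from extend(prefix + ch)
    yield from extend("")

first, links = train(["pass", "past", "pack", "test"], max_len=4)
small = list(candidates(first, links, length=4, threshold=1))
large = list(candidates(first, links, length=4, threshold=2))
```

With this tiny wordlist a threshold of 1 leaves a single candidate, while a threshold of 2 produces more: a lower value gives a smaller keyspace, which is the only thing you really need to remember.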

But if you take a close look at it, the technically correct description would be: "Brute-Force attack enhanced by per-position markov-chains built out of wordlists for statistics with the ability to use filters using a mask". OK? That required some special naming, and since it's 100% replacing Brute Force, we made it simple for ourselves and called it Brute-Force++.

Here is a nice chart that visualizes the efficiency of Brute-Force++:

[Image: bfpp.png]

The original description of how this works can be found here:

http://hashcat.net/forum/thread-1291.html
http://hashcat.net/forum/thread-1285.html
http://hashcat.net/forum/thread-1265.html

Use .ptx and .llvmir intermediate kernels - from oclHashcat-lite

The kernels are distributed in an "intermediate" format (aka IL). This format cannot be reversed to its original C code, but is still not a binary format that can be used for execution.

The JIT (just-in-time) compilers from both OpenCL and CUDA, which ship with the driver, compile the final bytecode out of the IL. This takes a few seconds per kernel, but this is a one-time operation as the bytecode is cached (CUDA does it automatically; OpenCL does not, but we added a function that emulates CUDA's behavior).

This has some nice advantages:

Not 32/64 bit specific
Less HDD space
Smaller .7z
Fewer driver-specific problems, as we often see with Catalyst
There is no more need to release a new oclHashcat-* in case a new driver optimization has been added. Cached oclHashcat-* kernels are driver specific. If it recognizes a driver change, it will rebuild the bytecode from the IL, but using the new JIT from the new driver, resulting in driver-specific optimized bytecode.
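The driver-specific cache described above can be sketched like this in Python. This is not the actual oclHashcat-* code: the cache key (SHA-1 over the IL plus the driver version), the file naming and the jit_compile callback are all assumptions made for the illustration.

```python
import hashlib
import os
import tempfile

def load_kernel(il_bytes, driver_version, cache_dir, jit_compile):
    """Return compiled bytecode, running the JIT only on a cache miss."""
    # A different driver version yields a different key, which forces a rebuild.
    key = hashlib.sha1(il_bytes + b"\0" + driver_version.encode()).hexdigest()
    path = os.path.join(cache_dir, key + ".bin")
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return fh.read()
    binary = jit_compile(il_bytes, driver_version)
    with open(path, "wb") as fh:
        fh.write(binary)
    return binary

# Demo with a stand-in for the driver's JIT compiler (hypothetical).
jit_calls = []

def fake_jit(il_bytes, driver_version):
    jit_calls.append(driver_version)
    return b"compiled-by-" + driver_version.encode() + b":" + il_bytes

with tempfile.TemporaryDirectory() as cache:
    first_run = load_kernel(b"kernel-il", "driver-A", cache, fake_jit)
    second_run = load_kernel(b"kernel-il", "driver-A", cache, fake_jit)   # cache hit
    after_update = load_kernel(b"kernel-il", "driver-B", cache, fake_jit)  # rebuilt
```

In the demo the JIT runs once for the first driver and once more after the simulated driver change; the second call with the same driver is served from the cache.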

Added Retaining GPU temperature - from oclHashcat-lite

When I started with oclHashcat-* hardware management support, some people asked me to add support for fan-speed. For a long time I was not interested in adding fan-speed code to oclHashcat-* since this is the job of the driver or some specialized controlling software.

I did not change my mind completely on this, but still we have added some fan-speed controlling code. The new parameters are:

Code:
--gpu-temp-disable            Disable temperature and fanspeed readings and triggers

--gpu-temp-abort=NUM          Abort session if GPU temperature reaches NUM degrees celsius

--gpu-temp-retain=NUM         Try to retain GPU temperature at NUM degrees celsius (AMD only)

So what this does is: if the temperature configured with the new --gpu-temp-retain parameter is reached, it starts to increase the fan-speed by 1 percent each second. That's all. In practice, this enables you to enforce a very specific operating temperature for your GPUs.

Some notes:

With --gpu-temp-disable you can completely disable all the temperature stuff.
--gpu-temp-retain currently only works for AMD.
--gpu-temp-abort parameter is just the renamed version of the old --gpu-watchdog.
Both parameters accept the 0 value which disables only this specific feature. This means you can step back to the old behavior by specifying --gpu-temp-retain 0.
The default for --gpu-temp-abort is still 90c.
The default for --gpu-temp-retain is 80c.
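Put together, the behaviour described above amounts to something like the following once-per-second decision, sketched here in Python. The function name fan_step is invented, and leaving the fan speed untouched below the retain temperature is an assumption, since only the increase is described in this post.

```python
def fan_step(temp_c, fan_pct, retain_c=80, abort_c=90):
    """One tick (one second): return (new_fan_pct, abort_session).

    A value of 0 for retain_c or abort_c disables that specific feature,
    mirroring --gpu-temp-retain 0 and --gpu-temp-abort 0.
    """
    if abort_c and temp_c >= abort_c:
        return fan_pct, True           # --gpu-temp-abort reached: stop the session
    if retain_c and temp_c >= retain_c:
        fan_pct = min(100, fan_pct + 1)  # +1 percent per second, capped at 100
    return fan_pct, False

# Example: a GPU sitting at 82 degrees for five seconds.
fan = 40
for _ in range(5):
    fan, abort = fan_step(82, fan)
```

After those five ticks the fan runs five percent faster than before, and nothing is aborted because 82 is still below the default abort temperature.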

More implemented feature requests from the forum:

http://hashcat.net/forum/thread-1303.html - Increment-mode for Brute Force
http://hashcat.net/forum/thread-1065.html - OpenLDAP SSHA's Dynamic Base64 Parser
http://hashcat.net/forum/thread-1335.html - Implement command line rules for plus
http://hashcat.net/forum/thread-1263.html - Add Charset ?a
http://hashcat.net/forum/thread-1140.html - Hashcat Exit Statuses
http://hashcat.net/forum/thread-1043.html - Next Dictionary In Line
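As a quick illustration of the new ?a charset (defined in the changelog as ?l?u?d?s), here is a small Python sketch that computes the keyspace of a mask. The exact content of ?s is an assumption here (space plus the 32 ASCII punctuation characters), and the parser only understands the built-in charsets and literal characters.

```python
import string

BUILTIN = {
    "l": string.ascii_lowercase,
    "u": string.ascii_uppercase,
    "d": string.digits,
    "s": " " + string.punctuation,  # assumed content of ?s
}
BUILTIN["a"] = BUILTIN["l"] + BUILTIN["u"] + BUILTIN["d"] + BUILTIN["s"]

def mask_charsets(mask):
    """Split a mask like '?u?l?l?d' into one charset per position."""
    positions = []
    i = 0
    while i < len(mask):
        if mask[i] == "?" and i + 1 < len(mask):
            positions.append(BUILTIN[mask[i + 1]])  # KeyError on unknown charset
            i += 2
        else:
            positions.append(mask[i])  # literal character
            i += 1
    return positions

def mask_keyspace(mask):
    """Number of candidates the mask generates."""
    total = 1
    for charset in mask_charsets(mask):
        total *= len(charset)
    return total
```

For example, mask_keyspace("?a?a?a?a") is the full keyspace of 4 printable ASCII characters, while "?u?l?l?l" covers only a small fraction of it.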

More implemented feature requests from PM / IRC / Email:

Default-mask for -a 3 mode from oclHashcat-lite v0.10
Commandline switch --disable-potfile feature from hashcat v0.40

This new version has been tested by many beta testers on a wide variety of hardware and operating systems.

All new features were available to beta testers for several weeks. All we did for the last few weeks was perform both automated and manual tests of all features and algorithms, until all issues were 100% fixed.

We want to say a special thank-you to the following beta-testers for their massive support during development:

This is great proof of how the cracking community is working together, regardless of what team they are on.

Of course we want to say thanks to all the beta testers who helped finding bugs and suggesting things as well -- Thanks!

--
atom and matrix

Full changelog:

Code:
type: feature

file: kernels

desc: added -m 10 = md5(pass.salt)

type: feature

file: kernels

desc: added -m 20 = md5(salt.pass)

type: feature

file: kernels

desc: added -m 30 = md5(unicode(pass).salt)

type: feature

file: kernels

desc: added -m 40 = md5(salt.unicode(pass))

type: feature

file: kernels

desc: added -m 110 = sha1(pass.salt)

type: feature

file: kernels

desc: added -m 120 = sha1(salt.pass)

type: feature

file: kernels

desc: added -m 130 = sha1(unicode(pass).salt)

type: feature

file: kernels

desc: added -m 140 = sha1(salt.unicode(pass))

type: feature

file: kernels

desc: added -m 141 = EPiServer 6.x

cred: thorsheim

type: feature

file: kernels

desc: added -m 1410 = sha256(pass.salt)

type: feature

file: kernels

desc: added -m 1420 = sha256(salt.pass)

type: feature

file: kernels

desc: added -m 1710 = sha512(pass.salt)

type: feature

file: kernels

desc: added -m 1720 = sha512(salt.pass)

type: feature

file: kernels

desc: added -m 1800 = sha512crypt, SHA512(Unix)

type: feature

file: kernels

desc: added -m 3200 = bcrypt

type: feature

file: kernels

desc: removed -a 4 permutation attack (use rules and combinator-attack instead)

type: feature

file: kernels

desc: added reversing kernel for multihash MD5 if running in -a 3 mode and mask < length 9

type: feature

file: kernels

desc: added reversing kernel for multihash MD4 if running in -a 3 mode and mask < length 13

type: feature

file: kernels

desc: added reversing kernel for multihash NTLM if running in -a 3 mode and mask < length 9

type: feature

file: kernels

desc: on AMD, switched from .kernel to .llvmir to reduce diskspace

type: feature

file: kernels

desc: on NV, switched from .cubin to .ptx to reduce diskspace

type: feature

file: kernels

desc: added kernel cache to avoid unnecessary recompilation

cred: m4tr1x

type: feature

file: kernels

desc: brought back support for AMD hd4xxx GPUS due to .llvmir integration

type: feature

file: kernels

desc: optimized 0x80 handling; +3.6% speed in combinator- and hybrid-attack

type: feature

file: host programs

desc: added support for Virtual OpenCL (VCL) Cluster Platform VCL 1.15

cred: epixoip

type: feature

file: host programs

desc: added support for up to 128 GPUS

type: feature

file: host programs

desc: ported markov-attack from oclHashcat-lite v0.10

type: feature

file: host programs

desc: ported increment-mode from oclHashcat-lite v0.10

type: feature

file: host programs

desc: ported default-mask from oclHashcat-lite v0.10

type: feature

file: host programs

desc: ported -j and -k single rules from oclHashcat v0.27

type: feature

file: host programs

desc: allowed zero-length salts in the generic algorithms makes it more easy to exploit them

type: feature

file: host programs

desc: added next-dictionary-in-line feature to skip inefficient dictionaries on keypress

type: feature

file: host programs

desc: implemented base64 parser that would allow for dynamic salt lengths in nsldaps

type: feature

file: host programs

desc: worked around memory allocation limit, you can load twice as much hashes in multihash

type: driver

file: kernels

desc: added support for NVidia CUDA 5.0

type: driver

file: kernels

desc: added support for AMD APP SDK v2.7

type: driver

file: host programs

desc: added support for NVidia NVML library and got rid of nvidia-smi command

type: feature

file: host programs

desc: splitted --gpu-watchdog to --gpu-temp-disable and --gpu-temp-abort

type: feature

file: host programs

desc: added --gpu-temp-retain to try retain temperature at NUM degrees celsius

cred: m4tr1x

type: feature

file: host programs

desc: worked around AMD bug in clGetDeviceInfo() CL_DEVICE_MAX_CLOCK_FREQUENCY

cred: m4tr1x

type: change

file: host program

desc: updated exit status code, see status_codes.txt for details

cred: m4tr1x

type: feature

file: host programs

desc: backported --disable-potfile feature from hashcat v0.41

cred: m4tr1x

type: feature

file: host programs

desc: add ?a to built-in charsets as ?l?u?d?s

cred: m4tr1x

type: feature

file: host programs

desc: added fan-speeds to status display

type: bug

file: host programs

desc: fixed a bug in host program for WPA/WPA2 in -a 1, -a 6 and -a 7 mode

cred: bjorn

type: bug

file: kernels

desc: fixed a bug in kernel for WPA/WPA2 on AMD VLIW architecture leading to code not found

cred: DrGeek

type: change

file: contact.txt

desc: updated contact information (moved to freenode IRC)

M@LIK · (This post was last modified: 09-07-2012, 04:57 PM by M@LIK.)

Great work! Good job all!

kartan · (This post was last modified: 09-07-2012, 09:38 PM by kartan.)

fucking amazing! this would qualify for a major release!

forumhero · 09-07-2012, 10:12 PM

fantastic work, everyone!

***atom*** · 09-08-2012, 04:15 PM

As said in the release notes, here is the howto:

Building GPU-Clusters for oclHashcat with VCL v1.15: https://hashcat.net/wiki/doku.php?id=vcl_cluster_howto

mastercracker · 09-08-2012, 08:18 PM

(09-08-2012, 04:15 PM)atom Wrote: As said in the release notes, here is the howto:

Building GPU-Clusters for oclHashcat with VCL v1.15: https://hashcat.net/wiki/doku.php?id=vcl_cluster_howto

Good wiki. It's not mentioned but I guess that you are bound with the same limitation as the OCL version which is that you need the same cards on each machine or at least the cards using the same kernel, right?

***atom*** · 09-08-2012, 10:21 PM

Yes, right, while my prio 1 is to enable mixed gpu types for v0.10 :)

Mem5 · 09-09-2012, 03:20 PM

Thanks ! great release as always !

forumhero · 09-10-2012, 06:15 PM

atom, just wanted to clarify. is the master node required to be on the same highspeed LAN or can it be on wireless?

***atom*** · 09-10-2012, 06:20 PM

Wireless LAN is a highspeed LAN, somewhat :)

Should work, yes!
