hashcat v4.0.0
#1


Welcome to the hashcat 4.0.0 release!



This release deserved the 4.x.x major version increase because of a new major feature:

Added support to crack passwords and salts up to length 256

Internally, this change took a lot of effort - many months of work. The first step was to add an OpenSSL-style low-level hash interface with the typical HashInit(), HashUpdate() and HashFinal() functions. After that, every OpenCL kernel had to be rewritten from scratch using those functions. Adding the OpenSSL-style low-level hash functions also had the advantage that you can now add new kernels more easily to hashcat - but the disadvantage is that such kernels are slower than hand-optimized kernels.
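To illustrate the Init/Update/Final pattern described above, here is a minimal sketch in Python. It uses the standard hashlib module rather than hashcat's OpenCL code, and the function name digest_streaming is made up for this example. The point is only that a streaming interface accepts input of any length in pieces, which is what removes a fixed password-length limit.

```python
import hashlib

def digest_streaming(chunks):
    """Hash data piece by piece, Init/Update/Final style (illustrative only)."""
    ctx = hashlib.md5()        # "HashInit": create a fresh context
    for chunk in chunks:
        ctx.update(chunk)      # "HashUpdate": feed any amount of data
    return ctx.hexdigest()     # "HashFinal": pad and produce the digest

# A 256-byte password, fed in 64-byte pieces (MD5's block size).
password = b"A" * 256
pieces = [password[i:i + 64] for i in range(0, len(password), 64)]

streamed = digest_streaming(pieces)
one_shot = hashlib.md5(password).hexdigest()
print(streamed == one_shot)
```

The streamed digest matches the one-shot digest regardless of how the input is split.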

The OpenCL kernels from 3.6.0 were all hand-optimized for performance. No worries - these kernels still exist, and can be explicitly requested with the new -O (optimized kernel) option. This configures hashcat to use the optimized OpenCL kernels, but at the cost of limited password length support (typically 32).

Added self-test functionality to detect broken OpenCL runtimes on startup

Another important missing feature in the previous hashcat version was the self-test on startup. Some (mostly older) OpenCL runtimes were somewhat buggy (thanks to NV and AMD) in ways that created non-working kernels. The problem was that the user didn't get any error message that clarified the reason for the problems. With this version, hashcat tries to crack a known hash on startup with a known password. Failing to crack a simple known hash is a bulletproof way to test whether your system is set up correctly.
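The idea of a known-answer self-test can be sketched as follows. This is not hashcat's actual implementation: the names self_test and broken_backend are hypothetical, and the known pair is the MD5("abc") vector from the RFC 1321 test suite rather than whatever hash hashcat uses internally.

```python
import hashlib

# Known-answer pair: MD5("abc") from the RFC 1321 test suite.
KNOWN_PASSWORD = b"abc"
KNOWN_HASH = "900150983cd24fb0d6963f7d28e17f72"

def md5_backend(data):
    # Stand-in for a working kernel.
    return hashlib.md5(data).hexdigest()

def broken_backend(data):
    # Stand-in for a miscompiled kernel: always returns a wrong digest.
    return "0" * 32

def self_test(hash_fn=md5_backend):
    """Return True if the backend reproduces the known answer."""
    return hash_fn(KNOWN_PASSWORD) == KNOWN_HASH

print(self_test())                 # working backend
print(self_test(broken_backend))   # broken backend is detected
```

If the check fails, the tool can stop early with a clear message instead of silently running kernels that never find anything.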

Added hash-mode 2501 = WPA/WPA2 PMK

This mode was added to run precomputed PMK lists against a hccapx, like cowpatty did (genpmk). You still have to precompute the PMK. Please use wlangenpmk/wlangenpmkocl from hcxtools to do so.
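For readers wondering what "precompute the PMK" means: in WPA/WPA2-PSK, the PMK is commonly derived from the passphrase and the network's SSID with PBKDF2-HMAC-SHA1, 4096 iterations, 32 bytes of output. The sketch below assumes that standard derivation and uses Python's hashlib; compute_pmk is a hypothetical helper and not part of hashcat or hcxtools.

```python
import hashlib

def compute_pmk(passphrase: str, ssid: str) -> bytes:
    """Derive a 32-byte WPA/WPA2 PMK (assumes the standard PSK derivation)."""
    return hashlib.pbkdf2_hmac(
        "sha1",                      # HMAC-SHA1 as the PRF
        passphrase.encode("utf-8"),  # the candidate password
        ssid.encode("utf-8"),        # the SSID acts as the salt
        4096,                        # iteration count
        32,                          # output length in bytes
    )

pmk = compute_pmk("password", "IEEE")
print(pmk.hex())
```

Because the SSID is the salt, a precomputed PMK list is only valid for the SSID it was generated for. The expensive part is the 4096 iterations, which is why checking a precomputed list is so much faster than starting from passwords.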

Improved macOS support

The evil "abort trap 6" error is now handled in a different way. There is no more need to maintain many different OpenCL devices in the hashcat.hctune database.



Download here: https://hashcat.net/hashcat/



Features:
  • Added support to crack passwords and salts up to length 256
  • Added option --optimized-kernel-enable to use faster kernels but limit the maximum supported password- and salt-length
  • Added self-test functionality to detect broken OpenCL runtimes on startup
  • Added option --self-test-disable to disable self-test functionality on startup
  • Added option --wordlist-autohex-disable to disable the automatic conversion of $HEX[] words from the word list
  • Added option --example-hashes to show an example hash for each hash-mode
  • Removed option --weak-hash-check (zero-length password check) to speed up startup; it also caused many Trap 6 errors on macOS
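As a rough illustration of the $HEX[] conversion mentioned in the list above, the sketch below decodes a word of the form $HEX[...] into raw bytes and leaves everything else alone. It is an assumption about the general behaviour, not hashcat's parser: decode_word and its autohex parameter are hypothetical names, and edge cases (such as an empty $HEX[]) may be handled differently by hashcat.

```python
import re

# A whole word of the form $HEX[<even number of hex digits>].
HEX_WORD = re.compile(rb"\$HEX\[((?:[0-9a-fA-F]{2})*)\]")

def decode_word(word: bytes, autohex: bool = True) -> bytes:
    """Return the raw bytes for a $HEX[] word, or the word unchanged."""
    match = HEX_WORD.fullmatch(word)
    if autohex and match:
        return bytes.fromhex(match.group(1).decode("ascii"))
    return word  # plain word, malformed $HEX[], or conversion disabled

print(decode_word(b"$HEX[70617373]"))                 # decoded to raw bytes
print(decode_word(b"$HEX[70617373]", autohex=False))  # kept literally
```

Passing autohex=False mimics what disabling the conversion would mean: the literal text $HEX[70617373] is itself tried as a password.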


Algorithms:
  • Added hash-mode 2500 = WPA/WPA2 (SHA256-AES-CMAC)
  • Added hash-mode 2501 = WPA/WPA2 PMK


Bugs:
  • Fixed a buffer overflow in mangle_dupechar_last function
  • Fixed a calculation error in get_power() leading to errors of type "BUG pw_add()!!"
  • Fixed a memory problem that occurred when the OpenCL folder was not found and e.g. the shared and session folder were the same
  • Fixed a missing barrier() call in the RACF OpenCL kernel
  • Fixed a missing salt length value in benchmark mode for SIP
  • Fixed an integer overflow in hash buffer size calculation
  • Fixed an integer overflow in innerloop_step and innerloop_cnt variables
  • Fixed an integer overflow in masks not skipped when loaded from file
  • Fixed an invalid optimization code in kernel 7700 depending on the input hash, causing the kernel to loop forever
  • Fixed an invalid progress value in status view if words from the base wordlist get rejected because of length
  • Fixed a parser error for mode -m 9820 = MS Office <= 2003 $3, SHA1 + RC4, collider #2
  • Fixed a parser error in multiple modes not checking for return code, resulting in negative memory index writes
  • Fixed a problem with changed current working directory, for instance by using --restore together with --remove
  • Fixed a problem with the conversion to the $HEX[] format: convert/hexify also all passwords of the format $HEX[]
  • Fixed the calculation of device_name_chksum; should be done for each iteration
  • Fixed the dictstat lookup if nanoseconds are used in timestamps for the cached files
  • Fixed the estimated time value whenever the value is very large and overflows
  • Fixed the output of --show when used together with the collider modes -m 9710, 9810 or 10410
  • Fixed the parsing of command line options. It no longer shows the same error about an invalid option twice
  • Fixed the parsing of DCC2 hashes by allowing the "#" character within the user name
  • Fixed the parsing of descrypt hashes if the hashes have non-standard characters within the salt
  • Fixed the use of --veracrypt-pim option. It was completely ignored without showing an error
  • Fixed the version number used in the restore file header


Improvements:
  • Autotune: Do a pre-autotune test run to find out if kernel runtime is above some TDR limit
  • Charset: Add additional DES charsets with corrected parity
  • OpenCL Buffers: Do not allocate memory for amplifiers for fast hashes, it's simply not needed
  • OpenCL Kernels: Improved performance of SHA-3 Kernel (keccak) by hardcoding the 0x80 stopbit
  • OpenCL Kernels: Improved rule engine performance by 6% for NVidia
  • OpenCL Kernels: Move from ld.global.v4.u32 to ld.const.v4.u32 in _a3 kernels
  • OpenCL Kernels: Replace bitwise swaps with rotate() versions for AMD
  • OpenCL Kernels: Rewritten Keccak kernel to run fully on registers and partially reversed last round
  • OpenCL Kernels: Rewritten SIP kernel from scratch
  • OpenCL Kernels: Thread-count is set to hardware native count except if -w 4 is used then OpenCL maximum is used
  • OpenCL Kernels: Updated default scrypt TMTO to be ideal for latest NVidia and AMD top models
  • OpenCL Kernels: Vectorized tons of slow kernels to improve CPU cracking speed
  • OpenCL Runtime: Improved detection for AMD and NV devices on macOS
  • OpenCL Runtime: Improved performance on Intel MIC devices (Xeon PHI) on runtime level (300MH/s to 2000MH/s)
  • OpenCL Runtime: Updated AMD ROCm driver version check, warn if version < 1.1
  • Show cracks: Improved the performance of --show/--left if used together with --username
  • Startup: Add visual indicator of active options when benchmarking
  • Startup: Check and abort session if outfile and wordlist point to the same file
  • Startup: Show some attack-specific optimizer constraints on start, e.g. minimum and maximum supported password- and salt-length
  • WPA cracking: Improved nonce-error-corrections mode to use both positive and negative corrections


Technical:
  • General: Update C standard from c99 to gnu99
  • Hash Parser: Improved salt-length checks for generic hash modes
  • HCdict File: Renamed file from hashcat.hcdict to hashcat.hcdict2 and add header because versions are incompatible
  • HCstat File: Add code to read LZMA compressed hashcat.hcstat2
  • HCstat File: Add hcstat2 support to enable masks of length up to 256, also adds a filetype header
  • HCstat File: Renamed file from hashcat.hcstat to hashcat.hcstat2 and add header because versions are incompatible
  • HCtune File: Remove Apple-related GPU entries to work around Trap 6 error
  • OpenCL Kernels: Added code generator for most of the switch_* functions and replaced existing code
  • OpenCL Kernels: Declared all include functions as static to reduce binary kernel cache size
  • OpenCL Kernels: On AMD GPU, optimized kernels for use with AMD ROCm driver
  • OpenCL Kernels: Removed some include functions that are no longer needed to reduce compile time
  • OpenCL Runtime: Fall back to 64 threads default (from 256) on AMD GPU to prevent creating too many workitems
  • OpenCL Runtime: Forcing OpenCL 1.2 no longer needed. Option removed from build options
  • OpenCL Runtime: On AMD GPU, recommend AMD ROCm driver for Linux
  • Restore: Fixed the version number used in the restore file header
  • Time: Added new type for time measurements, hc_time_t, and related functions to force the use of 64-bit times


- atom
#2
A *lot* of work behind the scenes to make longer passwords possible. Thank you, atom!

(And don't forget: if you don't need longer passwords, always remember to add -O now!)

Selected benchmarks (6x 1080, -O, -w4):

Code:
Benchmark relevant options:
===========================
* --optimized-kernel-enable
* --workload-profile=4


Hashmode: 0 - MD5

Speed.Dev.#1.....: 25785.1 MH/s (208.47ms)
Speed.Dev.#2.....: 25341.8 MH/s (212.31ms)
Speed.Dev.#3.....: 25529.2 MH/s (210.42ms)
Speed.Dev.#4.....: 25519.2 MH/s (211.08ms)
Speed.Dev.#5.....: 25609.2 MH/s (209.94ms)
Speed.Dev.#6.....: 25625.7 MH/s (209.77ms)
Speed.Dev.#*.....:   153.4 GH/s

Hashmode: 1000 - NTLM

Speed.Dev.#1.....: 43298.8 MH/s (123.90ms)
Speed.Dev.#2.....: 42221.4 MH/s (126.47ms)
Speed.Dev.#3.....: 43121.5 MH/s (124.94ms)
Speed.Dev.#4.....: 42907.5 MH/s (125.24ms)
Speed.Dev.#5.....: 42746.5 MH/s (124.83ms)
Speed.Dev.#6.....: 42805.2 MH/s (124.69ms)
Speed.Dev.#*.....:   257.1 GH/s

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#1.....:   413.7 kH/s (390.55ms)
Speed.Dev.#2.....:   406.1 kH/s (399.43ms)
Speed.Dev.#3.....:   412.8 kH/s (392.60ms)
Speed.Dev.#4.....:   409.2 kH/s (395.24ms)
Speed.Dev.#5.....:   412.1 kH/s (393.24ms)
Speed.Dev.#6.....:   412.0 kH/s (392.62ms)
Speed.Dev.#*.....:  2466.0 kH/s

Hashmode: 5600 - NetNTLMv2

Speed.Dev.#1.....:  1812.0 MH/s (368.30ms)
Speed.Dev.#2.....:  1779.2 MH/s (375.71ms)
Speed.Dev.#3.....:  1797.9 MH/s (370.41ms)
Speed.Dev.#4.....:  1789.6 MH/s (371.93ms)
Speed.Dev.#5.....:  1799.5 MH/s (370.97ms)
Speed.Dev.#6.....:  1802.4 MH/s (369.35ms)
Speed.Dev.#*.....: 10780.6 MH/s
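
As a quick sanity check on the figures above, the Speed.Dev.#* line is simply the sum of the per-device speeds. A small sketch of that arithmetic in Python (the helper name total_speed is made up for this example):

```python
def total_speed(per_device):
    """Sum per-device speeds; the unit is whatever the inputs use."""
    return sum(per_device)

# Per-device speeds in MH/s, copied from the benchmark output above.
md5 = [25785.1, 25341.8, 25529.2, 25519.2, 25609.2, 25625.7]
ntlm = [43298.8, 42221.4, 43121.5, 42907.5, 42746.5, 42805.2]

# Convert MH/s to GH/s for the combined figure.
print(f"MD5:  {total_speed(md5) / 1000:.1f} GH/s")
print(f"NTLM: {total_speed(ntlm) / 1000:.1f} GH/s")
```

Both totals round to the combined values reported in the benchmark (153.4 GH/s and 257.1 GH/s).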

Full benchmarks: standard format, and then summarized and sorted by speed
~
#3
Again, thanks a lot atom & the team!
I will test it ASAP.
#4
thanks a lot :D
#5
thank you.
#6
Even though I have a Mac, and it's slow because it's using the default video card, it's a little faster now. Not like other users on here with their mega builds.

OpenCL Platform #1: Apple
=========================
* Device #1: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz, skipped.
* Device #2: Iris Pro, 384/1536 MB allocatable, 40MCU

Benchmark relevant options:
===========================
* --optimized-kernel-enable

Hashmode: 2500 - WPA/WPA2

Speed.Dev.#2.....: 8353 H/s (74.26ms)


Thinking of getting an Alienware 15 laptop with a GeForce GTX 1070; maybe that's a little faster now that 4.0 is out.
#7
This is the first time we can make use of the new version semantics to announce a pure bugfix release: v4.0.1.

Here are the changes:
  • Fixed a memory leak while parsing a wordlist
  • Fixed compile of kernels on AMD systems on Windows due to invalid detection of ROCm
  • Fixed compile of sources using clang under MSYS2
  • Fixed overlapping memory segment copy in CPU rule engine if using a specific rule function
  • Fixed a parallel build problem when using the "install" Makefile target
  • Fixed the version number extraction for github releases which do not include the .git directory
#8
big thx for your work :)
#9
I haven't put the rules files under a microscope yet, but have any "pre-fab" rules been adjusted so as to accommodate passwords now being up to 256 characters in length for some kernel types?

I suspect that some of the rules files are going to require adjustment.

*goes to investigate*
#10
Fair question - though it's also the case that existing rules will now produce results that wouldn't have worked before.

In other words, if you run your existing rules and lists against your unfound lists, you'll get hits that you didn't get before.
~