05-26-2014, 04:27 PM
Download here: https://hashcat.net/oclhashcat/
This release focuses on performance increases and bugfixes.
As always, make sure to unpack into a new folder. Never reuse an existing oclHashcat folder (because of the cached kernels).
Added Algorithms
- PHPS
- Lotus Notes/Domino 5
- Lotus Notes/Domino 6
Logging support
There's new basic logfile support. It's just a start, to find out whether users like it or not. The log file is formatted so that it is easily readable by humans. The format is also easy to parse: each line contains a key-value pair separated by a tab character.
More attributes will be added over time, such as statistical information (for example speed) and other useful information like cracked hashes. For now, it contains mostly default values, user-changed parameters, and the filenames of the dictionary and hash files.
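Since each line is just a key and a value separated by a tab, reading the log back takes only a few lines. Here is a minimal sketch in Python, assuming nothing more than the layout described above. The key names in the sample (attack_mode, hash_type, dictionary) are made up for illustration and are not taken from a real oclHashcat log file.

```python
def parse_logfile_lines(lines):
    """Collect tab-separated key-value lines into a dict.

    Assumes one key-value pair per line, split at the first tab.
    Lines without a tab are skipped.
    """
    entries = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # no tab found, so this is not a key-value line
        entries[key] = value
    return entries


# Hypothetical sample lines, for illustration only.
sample = [
    "attack_mode\t0\n",
    "hash_type\t2612\n",
    "dictionary\twords.txt\n",
]
parsed = parse_logfile_lines(sample)
```

If a key shows up more than once, this sketch keeps the last value. Collecting the values into a list per key would be the obvious alternative.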
Performance increase: New sm_50 instructions, NVidia only
With the GTX 750ti, NVidia made a big move in our direction. This is the first card that supports the new sm_50 low-level instructions. For detailed information about them, Ivan Golubev wrote an excellent blog post here: https://www.golubev.com/blog/?p=291
In short, in the past AMD had a strong advantage over NVidia with its BFI_INT and BIT_ALIGN instructions. Those instructions are very useful in crypto. NVidia has drawn level with the new LOP3.LUT instruction and the SHF instruction (which was already introduced with sm_35).
Actually, LOP3.LUT has an advantage over BFI_INT: it's much more flexible and can be used in other cases as well. Additionally, NVidia added another instruction, "IADD3", that adds 3 integers at once and stores the result in a fourth integer. This instruction is also very useful in crypto.
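To make the flexibility point a bit more concrete, here is a rough software model of these instructions in Python. It is only a sketch of the semantics, not kernel code: the function names are mine, the BFI_INT operand order (mask first) is an assumption, and the 8-bit table is derived with the usual trick of feeding the constants 0xF0, 0xCC and 0xAA through the desired boolean function.

```python
MASK32 = 0xFFFFFFFF


def lut_for(func):
    """Derive the 8-bit lookup table for a three-input boolean function."""
    return func(0xF0, 0xCC, 0xAA) & 0xFF


def lop3(a, b, c, lut):
    """Model of a three-input logic op driven by an 8-bit truth table."""
    result = 0
    for i in range(8):
        if (lut >> i) & 1:
            # Minterm i matches where (a, b, c) bits equal the bits of i.
            ta = a if i & 4 else ~a
            tb = b if i & 2 else ~b
            tc = c if i & 1 else ~c
            result |= ta & tb & tc
    return result & MASK32


def bfi_int(mask, a, b):
    """Model of a bit select: take a where mask is 1, b where mask is 0."""
    return ((a & mask) | (b & ~mask)) & MASK32


def iadd3(a, b, c):
    """Model of a three-operand 32-bit add with wraparound."""
    return (a + b + c) & MASK32


SELECT_LUT = lut_for(lambda a, b, c: (a & b) | (~a & c))
MAJORITY_LUT = lut_for(lambda a, b, c: (a & b) | (a & c) | (b & c))
XOR3_LUT = lut_for(lambda a, b, c: a ^ b ^ c)
```

With the table 0xCA the model behaves exactly like the bit select, which is the one thing BFI_INT does. The same single operation also covers the majority function (0xE8) and a three-way XOR (0x96), which would otherwise take several instructions each.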
The GTX 750ti is a low-budget card, but the next high-end NVidia card based on sm_50 could be a game-changer. Also note that NVidia has much better multihash performance, not to mention the driver...
Here's how performance increased with the new low-level optimizations (yes, they will work with any other future high-end sm_50 card):
Performance increase: Workaround for bad integration of 64-bit integer operations, AMD and NVidia
All the 64-bit based algorithms like SHA512, Keccak etc. lost a little performance with each new driver, so the drop was hard to notice. GPU instructions still operate on 32 bits only, so 64-bit mode is emulated, but the way it was emulated was somehow broken. I was able to pinpoint where the biggest drop came from and managed to work around it. For NVidia it took a little PTX hack; for AMD, luckily, no binary hack was required. This optimization does not apply to all algorithms as the sm_50 optimization does, but in return it works on all cards, not just sm_50. Not much more to say, let the numbers do the talking:
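For readers wondering what "emulated" means here: a 64-bit value is kept as two 32-bit halves, and every 64-bit operation becomes a short sequence of 32-bit ones. The Python sketch below shows the idea for an add and a rotate, the two operations SHA512-style code leans on most. It only illustrates the general technique. It is not the PTX hack or the AMD workaround mentioned above, and the function names are made up.

```python
MASK32 = 0xFFFFFFFF


def add64_halves(a_hi, a_lo, b_hi, b_lo):
    """64-bit add built from two 32-bit adds plus a carry."""
    lo = a_lo + b_lo
    carry = lo >> 32  # 1 if the low halves overflowed, else 0
    hi = (a_hi + b_hi + carry) & MASK32
    return hi, lo & MASK32


def rotr64_halves(hi, lo, n):
    """64-bit rotate right built from 32-bit shifts on the two halves."""
    n %= 64
    if n >= 32:
        # Rotating by 32 just swaps the halves.
        hi, lo = lo, hi
        n -= 32
    if n == 0:
        return hi, lo
    new_lo = ((lo >> n) | (hi << (32 - n))) & MASK32
    new_hi = ((hi >> n) | (lo << (32 - n))) & MASK32
    return new_hi, new_lo
```

Each half of the rotate is a pair of shifts merged with an OR, which is the pattern funnel-shift style instructions such as BIT_ALIGN or SHF can do in a single step.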
Changes for GCN hd7xxx / R9 series
Changes for hd5xxx / hd6xxx / Non-GCN hd7xxx series
Full Changeset
Quote:
Type: Feature
File: Host
Desc: Added support for algorithm -m 2612 = PHPS
Type: Feature
File: Kernel
Desc: Added support for algorithm -m 8600 = Lotus Notes/Domino 5
Type: Feature
File: Kernel
Desc: Added support for algorithm -m 8700 = Lotus Notes/Domino 6
Type: Workaround
File: Kernel
Desc: Fixed performance drop on descrypt, LM and oracle-old initiated by AMD drivers
Type: Workaround
File: Host
Desc: Fixed problem with restoring ADL performance state when the clock size reported by the AMD driver didn't respect the clock step size
Trac: #435
Type: Workaround
File: Host
Desc: Fixed problem with setting ADL powertune value for r9 295x2 GPUs
Trac: #438
Type: Feature
File: Host
Desc: Added support for writing logfiles
Trac: #420
Type: Feature
File: Host
Desc: Added parameter --logfile-disable which should be self-explanatory
Trac: #420
Type: Change
File: Host
Desc: Dictstat is now no longer session dependent and will always be based on oclHashcat installation directory
Trac: #437
Type: Change
File: Host
Desc: Use AMD custom profile settings instead of basing the AMD powertune/clock settings on maximum supported clock values
Trac: #433
Type: Bug
File: Host
Desc: Fixed VLIW size calculation by compute capability, which was broken for sm_50 -> cuModuleLoad() 301
Type: Bug
File: Host
Desc: Make --runtime count relative to real attack start not program start
Type: Bug
File: Host
Desc: Fixed bug with fan speed handling: if fan speed is manually set to a high enough value (e.g. 100%), oclHashcat shouldn't change it
Trac: #439
Type: Bug
File: Host
Desc: Problem with username parsing (--username) was fixed
Trac: #441
Type: Bug
File: Host
Desc: Fixed problem where IKE-PSK sha1/md5 (-m 5300/-m 5400) were wrongly recognized as shadow file formats
Trac: #443
Type: Bug
File: Host
Desc: Fixed problem where the 'delete range' rule (xNM) did not allow removing characters at the very end of the word
Trac: #444
--
atom