cudaHashCat64 on AWS EC2
#1
Hey All,

Worked on a fun project and thought this might help someone else who's looking for some serious hardware to crack on. There are a couple of other guides out there, but they are pretty outdated and some of the AMIs don't even work anymore.

As a side note, there is no clean way to monitor your GPU through the AWS console, but you can push custom metrics to CloudWatch. That takes some custom scripting, and I'm not the best with virtualization layers. I'm also just getting into hashcat, so I apologize if the test is not tuned.
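
For anyone who wants to try the CloudWatch route, here is a minimal sketch of what that custom scripting could look like. It assumes boto3 is installed and the instance has permission to call PutMetricData. The namespace "Custom/GPU" and the metric names are made up for illustration, and it assumes the driver actually reports these nvidia-smi fields for the K520 (an unsupported field comes back as text and would make float() fail).

```python
import subprocess

# Standard nvidia-smi query properties; whether the GRID K520 reports all of
# them under this driver is an assumption.
QUERY = "utilization.gpu,temperature.gpu,memory.used"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output, one GPU per line."""
    rows = []
    for index, line in enumerate(text.strip().splitlines()):
        util, temp, mem = [field.strip() for field in line.split(",")]
        rows.append({
            "gpu": index,
            "utilization_pct": float(util),
            "temperature_c": float(temp),
            "memory_used_mib": float(mem),
        })
    return rows


def to_metric_data(rows, instance_id):
    """Build the MetricData list that CloudWatch put_metric_data accepts."""
    # Metric names are hypothetical; the units are valid CloudWatch unit strings.
    metrics = {
        "utilization_pct": ("GPUUtilization", "Percent"),
        "temperature_c": ("GPUTemperature", "None"),
        "memory_used_mib": ("GPUMemoryUsed", "Megabytes"),
    }
    data = []
    for row in rows:
        dims = [{"Name": "InstanceId", "Value": instance_id},
                {"Name": "GPU", "Value": str(row["gpu"])}]
        for key, (name, unit) in metrics.items():
            data.append({"MetricName": name, "Dimensions": dims,
                         "Value": row[key], "Unit": unit})
    return data


def push_once(instance_id, namespace="Custom/GPU"):
    """Read the GPUs once and push the values (needs boto3, a region, IAM rights)."""
    import boto3  # imported here so the parsing helpers work without boto3
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=" + QUERY, "--format=csv,noheader,nounits"],
        universal_newlines=True)
    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace,
        MetricData=to_metric_data(parse_gpu_csv(out), instance_id))
```

Calling push_once() from cron every minute or so would be one way to get a graph in the console; that scheduling choice is also just a suggestion.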

STEPS:
1. Sign up for AWS, check your wallet and make sure you have enough funds to run a g2.2xlarge
You can find pricing here:

2. Launch an Amazon Linux AMI (the one I used was ami-146e2a7c) using the g2.2xlarge and configure whatever else you want on the instance (storage, tag, etc).

3. Run a "sudo yum update"

4. Run "lspci" to check the host info:

lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371SB PIIX3 ISA [Natoma/Triton II]
00:01.1 IDE interface: Intel Corporation 82371SB PIIX3 IDE [Natoma/Triton II]
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 01)
00:02.0 VGA compatible controller: Cirrus Logic GD 5446
00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)
00:1f.0 Unassigned class [ff80]: XenSource, Inc. Xen Platform Device (rev 01)

Used this line to find the hardware info:
00:03.0 VGA compatible controller: NVIDIA Corporation GK104GL [GRID K520] (rev a1)

5. Download the driver from Nvidia:
wget http://us.download.nvidia.com/XFree86/Li...346.35.run

6. Change the permissions:
chmod +x NVIDIA-Linux-x86_64-346.35.run

7. NOTE: I originally had to do a yum install kernel* to get this to work, but later after trying it again on another instance, I did not need to.
yum install kernel*
reboot

8. Install dev tools:
yum groupinstall "Development Tools"

9. Install Nvidia drivers:
./NVIDIA-Linux-x86_64-346.35.run

10. Enable the EPEL repo on Amazon Linux by editing its file under /etc/yum.repos.d/:
nano /etc/yum.repos.d/epel.repo

Under the section marked [epel], change enabled=0 to enabled=1.

11. Install p7zip:
yum install p7zip

12. Grab hashcat:
wget http://hashcat.net/files/cudaHashcat-1.33.7z

13. Extract the archive:
7za x cudaHashcat-1.33.7z
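
If you rebuild instances often, the manual edit in step 10 can be scripted. This is only a sketch: it assumes the stock epel.repo layout, where the [epel] section contains a plain enabled=0 line, and the function names are mine. It flips the flag in that one section and leaves the other sections in the file alone.

```python
def enable_repo(repo_text, section="epel"):
    """Return repo_text with enabled=0 changed to enabled=1 inside [section] only."""
    out = []
    in_section = False
    for line in repo_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header: track whether it is the one we want.
            in_section = (stripped[1:-1] == section)
        elif in_section and stripped.replace(" ", "") == "enabled=0":
            line = "enabled=1"
        out.append(line)
    return "\n".join(out) + "\n"


def enable_repo_file(path="/etc/yum.repos.d/epel.repo"):
    """Rewrite the repo file in place (needs root, like the manual edit)."""
    with open(path) as handle:
        updated = enable_repo(handle.read())
    with open(path, "w") as handle:
        handle.write(updated)
```

Run enable_repo_file() as root before step 11 so that yum can find p7zip.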

Once that was done I ran a benchmark just to test.

BENCHMARK:
cudaHashcat v1.33 starting in benchmark-mode...

Device #1: GRID K520, 4095MB, 797Mhz, 8MCU

Hashtype: MD4
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 4003.8 MH/s

Hashtype: MD5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 2501.7 MH/s

Hashtype: SHA1
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 688.3 MH/s

Hashtype: SHA256
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 296.4 MH/s

Hashtype: SHA384
Workload: 256 loops, 256 accel

Speed.GPU.#1.: 71293.0 kH/s

Hashtype: SHA512
Workload: 256 loops, 256 accel

Speed.GPU.#1.: 71354.4 kH/s

Hashtype: SHA-3(Keccak)
Workload: 128 loops, 32 accel


Speed.GPU.#1.: 69719.9 kH/s

Hashtype: RipeMD160
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 493.6 MH/s

Hashtype: Whirlpool
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 52330.1 kH/s

Hashtype: GOST R 34.11-94
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 42608.3 kH/s

Hashtype: SAP CODVN B (BCODE)
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 228.5 MH/s

Hashtype: SAP CODVN F/G (PASSCODE)
Workload: 1024 loops, 32 accel

Speed.GPU.#1.: 105.8 MH/s

Hashtype: SAP CODVN H (PWDSALTEDHASH) iSSHA-1
Workload: 1024 loops, 16 accel

Speed.GPU.#1.: 602.6 kH/s

Hashtype: Lotus Notes/Domino 5
Workload: 256 loops, 32 accel

Speed.GPU.#1.: 27998.6 kH/s

Hashtype: Lotus Notes/Domino 6
Workload: 256 loops, 32 accel

Speed.GPU.#1.: 9213.8 kH/s

Hashtype: Lotus Notes/Domino 8
Workload: 5000 loops, 64 accel

Speed.GPU.#1.: 72334 H/s

Hashtype: SHA-1(Base64), nsldap, Netscape LDAP SHA
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 688.2 MH/s

Hashtype: SSHA-1(Base64), nsldaps, Netscape LDAP SSHA
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 688.2 MH/s

Hashtype: descrypt, DES(Unix), Traditional DES
Workload: 128 loops, 256 accel

Speed.GPU.#1.: 24213.2 kH/s

Hashtype: md5crypt, MD5(Unix), FreeBSD MD5, Cisco-IOS MD5
Workload: 1000 loops, 32 accel

Speed.GPU.#1.: 1277.3 kH/s

Hashtype: sha256crypt, SHA256(Unix)
Workload: 5000 loops, 4 accel

Speed.GPU.#1.: 43965 H/s

Hashtype: sha512crypt, SHA512(Unix)
Workload: 5000 loops, 8 accel

Speed.GPU.#1.: 13402 H/s

Hashtype: bcrypt, Blowfish(OpenBSD)
Workload: 32 loops, 2 accel

Speed.GPU.#1.: 501 H/s

Hashtype: LM
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 294.3 MH/s

Hashtype: Oracle 11g/12c
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 687.8 MH/s

Hashtype: NTLM
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 4002.9 MH/s

Hashtype: DCC, mscash
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 1183.6 MH/s

Hashtype: NetNTLMv1-VANILLA / NetNTLMv1+ESS
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 1605.2 MH/s

Hashtype: NetNTLMv2
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 179.6 MH/s

Hashtype: Kerberos 5 AS-REQ Pre-Auth etype 23
Workload: 256 loops, 32 accel

Speed.GPU.#1.: 6156.8 kH/s

Hashtype: EPiServer 6.x < v4
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 362.7 MH/s

Hashtype: EPiServer 6.x > v4
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 273.1 MH/s

Hashtype: MSSQL(2000)
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 672.4 MH/s

Hashtype: MSSQL(2005)
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 672.1 MH/s

Hashtype: MSSQL(2012)
Workload: 256 loops, 256 accel

Speed.GPU.#1.: 71021.7 kH/s

Hashtype: MySQL323
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 8386.5 MH/s

Hashtype: MySQL4.1/MySQL5
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 326.2 MH/s

Hashtype: Oracle 7-10g
Workload: 512 loops, 32 accel

Speed.GPU.#1.: 115.8 MH/s

Hashtype: Sybase ASE
Workload: 512 loops, 32 accel

Speed.GPU.#1.: 32923.7 kH/s

Hashtype: Oracle 11g/12c
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 688.1 MH/s

Hashtype: OSX v10.4, v10.5, v10.6
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 362.7 MH/s

Hashtype: OSX v10.7
Workload: 128 loops, 256 accel

Speed.GPU.#1.: 68623.2 kH/s

Hashtype: OSX v10.8 / v10.9
Workload: 35000 loops, 2 accel

Speed.GPU.#1.: 827 H/s

Hashtype: Android PIN
Workload: 1024 loops, 16 accel

Speed.GPU.#1.: 612.3 kH/s

Hashtype: Android FDE <= 4.3
Workload: 2000 loops, 32 accel

Speed.GPU.#1.: 87108 H/s

Hashtype: scrypt
Workload: 1 loops, 64 accel

Speed.GPU.#1.: 25146 H/s

Hashtype: Cisco-PIX MD5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 1894.4 MH/s

Hashtype: Cisco-ASA MD5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 1911.4 MH/s

Hashtype: Cisco-IOS SHA256
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 296.3 MH/s

Hashtype: Cisco $8$
Workload: 20000 loops, 8 accel

Speed.GPU.#1.: 5866 H/s

Hashtype: Cisco $9$
Workload: 1 loops, 4 accel

Speed.GPU.#1.: 939 H/s

Hashtype: Juniper IVE
Workload: 1000 loops, 32 accel

Speed.GPU.#1.: 1276.1 kH/s

Hashtype: Citrix NetScaler
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 763.6 MH/s

Hashtype: DNSSEC (NSEC3)
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 343.5 MH/s

Hashtype: WPA/WPA2
Workload: 4096 loops, 32 accel

Speed.GPU.#1.: 42815 H/s

Hashtype: IKE-PSK MD5
Workload: 512 loops, 32 accel

Speed.GPU.#1.: 204.4 MH/s

Hashtype: IKE-PSK SHA1
Workload: 512 loops, 32 accel

Speed.GPU.#1.: 74340.2 kH/s

Hashtype: Password Safe v2
Workload: 1000 loops, 16 accel

Speed.GPU.#1.: 10326 H/s

Hashtype: Password Safe v3
Workload: 2048 loops, 16 accel

Speed.GPU.#1.: 117.8 kH/s

Hashtype: 1Password, agilekeychain
Workload: 1000 loops, 64 accel

Speed.GPU.#1.: 357.2 kH/s

Hashtype: 1Password, cloudkeychain
Workload: 40000 loops, 2 accel

Speed.GPU.#1.: 719 H/s

Hashtype: AIX {ssha1}
Workload: 64 loops, 128 accel

Speed.GPU.#1.: 4406.1 kH/s

Hashtype: TrueCrypt 5.0+ PBKDF2-HMAC-RipeMD160 + AES
Workload: 2000 loops, 64 accel

Speed.GPU.#1.: 118.8 kH/s

Hashtype: TrueCrypt 5.0+ PBKDF2-HMAC-SHA512 + AES
Workload: 1000 loops, 16 accel

Speed.GPU.#1.: 33061 H/s

Hashtype: TrueCrypt 5.0+ PBKDF2-HMAC-Whirlpool + AES
Workload: 1000 loops, 8 accel

Speed.GPU.#1.: 6018 H/s

Hashtype: TrueCrypt 5.0+ PBKDF2-HMAC-RipeMD160 + AES + boot-mode
Workload: 1000 loops, 64 accel

Speed.GPU.#1.: 234.9 kH/s

Hashtype: Office 2007
Workload: 50000 loops, 32 accel

Speed.GPU.#1.: 14264 H/s

Hashtype: Office 2010
Workload: 100000 loops, 32 accel

Speed.GPU.#1.: 7146 H/s

Hashtype: Office 2013
Workload: 100000 loops, 4 accel

Speed.GPU.#1.: 696 H/s

Hashtype: MS Office <= 2003 MD5 + RC4, oldoffice$0, oldoffice$1
Workload: 1024 loops, 32 accel

Speed.GPU.#1.: 6149.4 kH/s

Hashtype: MS Office <= 2003 SHA1 + RC4, oldoffice$3, oldoffice$4
Workload: 1024 loops, 32 accel

Speed.GPU.#1.: 8644.6 kH/s

Hashtype: PDF 1.1 - 1.3 (Acrobat 2 - 4)
Workload: 1024 loops, 32 accel

Speed.GPU.#1.: 0 H/s

Hashtype: PDF 1.1 - 1.3 (Acrobat 2 - 4) + collider-mode #1
Workload: 1024 loops, 32 accel

Speed.GPU.#1.: 0 H/s

Hashtype: PDF 1.1 - 1.3 (Acrobat 2 - 4) + collider-mode #2
Workload: 1024 loops, 32 accel

Speed.GPU.#1.: 385.6 MH/s

Hashtype: PDF 1.4 - 1.6 (Acrobat 5 - 8)
Workload: 70 loops, 256 accel

Speed.GPU.#1.: 36419 H/s

Hashtype: PDF 1.7 Level 3 (Acrobat 9)
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 296.4 MH/s

Hashtype: PDF 1.7 Level 8 (Acrobat 10 - 11)
Workload: 64 loops, 8 accel

Speed.GPU.#1.: 3769 H/s

Hashtype: Drupal7
Workload: 16384 loops, 8 accel

Speed.GPU.#1.: 4258 H/s

Hashtype: HMAC-MD5 (key = $pass)
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 247.2 MH/s

Hashtype: HMAC-MD5 (key = $salt)
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 545.4 MH/s

Hashtype: HMAC-SHA1 (key = $pass)
Workload: 256 loops, 256 accel

Speed.GPU.#1.: 81708.7 kH/s

Hashtype: HMAC-SHA1 (key = $salt)
Workload: 256 loops, 256 accel

Speed.GPU.#1.: 165.8 MH/s

Hashtype: HMAC-SHA256 (key = $pass)
Workload: 128 loops, 128 accel

Speed.GPU.#1.: 58958.2 kH/s

Hashtype: HMAC-SHA256 (key = $salt)
Workload: 128 loops, 128 accel

Speed.GPU.#1.: 117.3 MH/s

Hashtype: HMAC-SHA512 (key = $pass)
Workload: 128 loops, 128 accel

Speed.GPU.#1.: 16953.4 kH/s

Hashtype: HMAC-SHA512 (key = $salt)
Workload: 128 loops, 128 accel

Speed.GPU.#1.: 33909.5 kH/s

Hashtype: IPMI2 RAKP HMAC-SHA1
Workload: 256 loops, 256 accel

Speed.GPU.#1.: 176.1 MH/s

Hashtype: Half MD5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 1650.1 MH/s

Hashtype: Double MD5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 737.9 MH/s

Hashtype: GRUB 2
Workload: 10000 loops, 2 accel

Speed.GPU.#1.: 2860 H/s

Hashtype: phpass, MD5(Wordpress), MD5(phpBB3), MD5(Joomla)
Workload: 2048 loops, 32 accel

Speed.GPU.#1.: 672.5 kH/s

Hashtype: SipHash
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 3281.4 MH/s

Hashtype: Joomla < 2.5.18
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 2503.7 MH/s

Hashtype: osCommerce, xt:Commerce
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 782.2 MH/s

Hashtype: IPB2+, MyBB1.2+
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 472.6 MH/s

Hashtype: vBulletin < v3.8.5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 738.9 MH/s

Hashtype: PHPS
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 738.9 MH/s

Hashtype: vBulletin > v3.8.5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 454.5 MH/s

Hashtype: SMF > v1.1
Workload: 512 loops, 256 accel

Speed.GPU.#1.: 362.8 MH/s

Started: Mon Mar 2 15:55:43 2015
Stopped: Mon Mar 2 16:21:54 2015
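
The output mixes H/s, kH/s and MH/s, which makes it hard to compare lines by eye. Below is a rough sketch for turning a saved copy of the benchmark text into plain H/s numbers. The regular expressions are written against the output format shown above and are not part of hashcat itself, so they may need adjusting for other versions.

```python
import re

UNIT_FACTORS = {"H/s": 1, "kH/s": 1e3, "MH/s": 1e6, "GH/s": 1e9}
HASHTYPE_RE = re.compile(r"^Hashtype:\s*(.+?)\s*$")
SPEED_RE = re.compile(r"^Speed\.GPU\.#(\d+|\*)\.:\s*([\d.]+)\s*([kMG]?H/s)")


def parse_benchmark(text):
    """Map each hash type to {device: speed in H/s}.

    A hash type that appears twice in the output (Oracle 11g/12c does above)
    keeps the values from its last occurrence.
    """
    results = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        match = HASHTYPE_RE.match(line)
        if match:
            current = match.group(1)
            results.setdefault(current, {})
            continue
        match = SPEED_RE.match(line)
        if match and current is not None:
            device, value, unit = match.groups()
            results[current][device] = float(value) * UNIT_FACTORS[unit]
    return results
```

With the results in one unit, sorting by speed is a one-liner, for example sorted(results, key=lambda name: results[name]["1"]) to list the slowest hash types first.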
#2
K520 is 5.3x slower than a GTX 970 and 6x slower than a GTX 980, so probably not worth the cost.
#3
(03-03-2015, 12:13 AM)epixoip Wrote: K520 is 5.3x slower than a GTX 970 and 6x slower than a GTX 980, so probably not worth the cost.

For sure, it's just a small project I was working on. It's hard to beat the almost $0 startup cost. I can pay for several hundred hours for the same cost that it would take to start up something on my own (and in a fraction of the time).

But agreed, in the long run it may not be worth it, especially if the AWS cloud cannot keep up with top-end equipment.
#4
Hm, the WPA speed is similar to an Nvidia GTX 750 Ti.

Price per hour for a g2.2xlarge: $0.650
Price for the "hardware", an Nvidia GTX 750 Ti: ~$140

Quote: can pay for several hundred hours for the same costs
$140 / $0.65 = 215 hours = 9 days, so about 9 days on an EC2 g2.2xlarge costs the same as an EVGA GeForce GTX 750 Ti.
So I say yes. But if you buy the hardware card, you can use it again and again with no additional costs (except paying for electricity).

If I use 4 g2.2xlarge instances for 10 days, I get about 168 kH/s on average, and it costs $0.65 * 4 instances * 24 hours * 10 days = $624! WOW!
A better choice is to buy two GTX 970s! They give about 230 kH/s on average and an almost unlimited lifetime!
If you get tired of them, you can always sell your old GTX second-hand (eBay) to reduce your losses.

Unfortunately, brute forcing has never been a "small project"; it is always a long-running computing task.
#5
Just as a side note, I played with AWS a little, too.
Did you have any issues downloading oclHashcat from the site?
I saw that if I don't put a speed limit on the wget command, the site drops the connection.
Maybe there's some protection against bot downloading?

Finally, if you only work with "Spot instances", the price per hour is $0.065 (six and a half cents), so it's really cheap even if not very powerful.
#6
As of right now, I can get a g2.8xlarge spot instance for <$0.3/hr in my area.

Thanks for the tut, btw, it's exactly what I was looking for.
Only change is in hashcat version, but that was a simple fix. s/1.33.7z/1.36.7z/
This is a single g2.8xlarge instance.

cudaHashcat v1.36 starting in benchmark-mode...

Device #1: GRID K520, 4095MB, 797Mhz, 8MCU
Device #2: GRID K520, 4095MB, 797Mhz, 8MCU
Device #3: GRID K520, 4095MB, 797Mhz, 8MCU
Device #4: GRID K520, 4095MB, 797Mhz, 8MCU

Hashtype: MD4
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 3983.6 MH/s
Speed.GPU.#2.: 3987.5 MH/s
Speed.GPU.#3.: 3983.5 MH/s
Speed.GPU.#4.: 3986.5 MH/s
Speed.GPU.#*.: 15941.1 MH/s

Hashtype: MD5
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 2494.8 MH/s
Speed.GPU.#2.: 2495.5 MH/s
Speed.GPU.#3.: 2494.7 MH/s
Speed.GPU.#4.: 2496.2 MH/s
Speed.GPU.#*.: 9981.2 MH/s

Hashtype: SHA1
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 688.7 MH/s
Speed.GPU.#2.: 688.3 MH/s
Speed.GPU.#3.: 688.7 MH/s
Speed.GPU.#4.: 688.4 MH/s
Speed.GPU.#*.: 2754.0 MH/s

Hashtype: SHA256
Workload: 1024 loops, 256 accel

Speed.GPU.#1.: 296.3 MH/s
Speed.GPU.#2.: 296.3 MH/s
Speed.GPU.#3.: 296.2 MH/s
Speed.GPU.#4.: 296.3 MH/s
Speed.GPU.#*.: 1185.0 MH/s
#7
Those speeds are so dismal :P
#8
(05-12-2015, 04:49 PM)epixoip Wrote: Those speeds are so dismal :P

For 30 cents to run a small analysis, I'm happy :D

EDIT: Also, getting 240 GB of SSD storage and a 10 Gigabit port is nice too :P
#9
Having a virtual machine is not the same as having real hardware on hand. Saying otherwise is like bragging about a hot night with an inflatable woman. Your g2.8xlarge is slower than an HD 5970, which was released around 2009, so it's not so nice.
#10
(05-12-2015, 09:18 PM)KT819GM Wrote: Having a virtual machine is not the same as having real hardware on hand. Saying otherwise is like bragging about a hot night with an inflatable woman. Your g2.8xlarge is slower than an HD 5970, which was released around 2009, so it's not so nice.

Perhaps I should explain.

I am not claiming that a g2.8xlarge EC2 instance is a viable alternative to hashing with modern GPUs. I am painfully aware that there is more up-to-date hardware out there. However, for my purposes it is a completely acceptable, and in fact the only, alternative I have.

My goal is not to crack the password to the Gibson, but rather to spend an hour here and there testing projects that integrate networking, and perhaps spend an occasional weekend flipping bits.

Being a broke infosec undergrad, I do not have the money to drop on new hardware, and I won't any time in the next couple of years.

Finally, I realize I can spend 10 minutes crunching wordlists that would have taken days on my laptop, and for less than the money that is floating around my floorboard.

As a student, this has been an outstandingly helpful self-guided primer in cudaHashcat. I plan to use spot instances in the future for other small projects.

So as far as education goes, yes. Very nice.