+55 minutes in Generating Dictionary for 194GB
#1
Hi all,
I'm a newbie (so please excuse any mistakes I make) :-)

I need to recover my Ethereum wallet password. I remember the password structure, the words and the combinations, but not the way I put them all together, so I created a 194GB dictionary and started an AWS machine with a Tesla K80.

I installed cudaHashcat on Amazon Linux; the NVIDIA driver is OK and every test passes, or at least it seems so to my eyes. Then I benchmarked cudaHashcat64.bin and that is OK too.

Then I ran:

./cudaHashcat64.bin -m 5000 -a 0 wallet.hash /dictionary/verybigwordlist.txt

I'm pretty sure I did everything correctly, but I can't understand why I have now been waiting 55 minutes at "Generating dictionary stats for /dictionary/verybigwordlist.txt: 20537007704 bytes (1.00%), 347517734 words, 347517730 keyspace".

I'd like to know whether I did something wrong or everything is OK, that's all!
Also, if someone has documentation that explains this, please point me to it so I can work it out on my own.
Thanks !!
#2
Have you looked into hashcat's rules and masks? You may be able to significantly reduce the size of your wordlist and cover your likely password space faster.

For example, if your wordlist is doing things like appending numbers, toggling case, leet-ifying words, etc. then see the rules/ subdirectory for rule lists that you could use to do this instead.

Also, it looks like you're using cudaHashcat, which has been superseded (now it's all just hashcat, and needs OpenCL). You'll probably want to upgrade to the most current release (3.40) and install an OpenCL runtime.
#3
(03-19-2017, 06:54 AM)royce Wrote: Have you looked into hashcat's rules and masks? You may be able to significantly reduce the size of your wordlist and cover your likely password space faster.


Thank you I'll try to practice everything.
#4
By the way, it doesn't seem that Ethereum uses "just SHA3"; see https://github.com/ethereum/cpp-ethereum...#L378-L418

It seems that the SHA3() step is only performed after the pbkdf2 or scrypt hashing, and can only be used to validate whether the previous steps were correct.

In other words, the full algorithm doesn't seem to be "just" SHA3(pass) as one might think from your original post @dindolo1979.
#5
To elaborate on what @philsmd is saying, here is the answer I got when I asked whether the private key was simply a SHA3 hash...

----------------------------------------------------------------------------------------------------------------------------------------------
"No. Your bkp is not the SHA3 of your password.

It's really quite simple. In the beginning, god said genwallet and...

genwallet says:

genwallet(opts['seed'],pw,email)
You say "here's my email and pw"

seed says "give me super random number":

seed = random_key().decode('hex') # uses pybitcointools' 3-source random generator
so now you need to get your encseed:

encseed = aes.encryptData(pbkdf2(pw),seed)
so we head over to mr. aes and say whats up:

def encryptData(key, data, mode=AESModeOfOperation.modeOfOperation["CBC"], iv=None):
........
........
and now you have your encseed.

then you get the ethpriv:

ethpriv = sha3(seed)
and your address:

ethaddr = sha3(privtopub(ethpriv)[1:])[12:].encode('hex')
and finally your bkp:

bkp = sha3(seed + '\x02').encode('hex')
So your bkp is the sha3 of your seed plus essentially the number "2" (number "1" was used for your btcpriv to differentiate it from your ethpriv) encoded in hex.

The bkp is a backup obviously. But its not a backup of your password. Its a backup of your seed.

print "Your seed is:", getseed(b['withwallet'],w['bkp'],b['ethaddr'])

leads to...

def recover_bkp_pw(bkp,pw):
    return getseed(bkp['withpw'],pw,bkp['ethaddr'])
....
"withpw": aes.encryptData(pbkdf2(pw),seed).encode('hex'),
or...

def recover_bkp_wallet(bkp,wallet):
    return getseed(bkp['withwallet'],wallet['bkp'],bkp['ethaddr'])
...
"withwallet": aes.encryptData(pbkdf2(wallet['bkp']),seed).encode('hex'),
Get it now?"
----------------------------------------------------------------------------------------------------------------------------------------------

It only helped me realise how much more complicated this is than I'd hoped.

Since you say "I remember the password structure, the words and the combinations but not the way I used all together", you might find this helpful:
https://github.com/lexansoft/ethcracker

It is CPU only and has no mask wildcards (which is what I really need).

Hope this helps, and please post how you get on; I'm in the same boat and looking for a solution.

And if anyone else reading this could check out the GitHub link I posted and knows whether any optimisations are possible, I'd be grateful.
#6
What can I say beyond "THANK YOU"?
I'll study this and try to solve the puzzle.


(03-26-2017, 03:28 PM)Villan Wrote: To elaborate on what @philsmd is saying, here is the answer I got when I asked whether the private key was simply a SHA3 hash...

#7
@dindolo1979 Attention: I would be very careful with this small set of information you provided/got here.

As far as I understand, the algorithm is much simpler and more straightforward, and doesn't need any AES steps just to validate the password.

If you are really interested in some more (technical) discussion, and/or if you want these algorithms (actually yes, there are at least 2 different algorithms!) to be added to hashcat, we should continue to collect information, and maybe you can contribute a little bit (with some more info, e.g. which file a user normally has; I'm thinking of e.g. the ~/.web3/keys/ files on Linux, but I'm not too familiar with Ethereum)...

This is what I have so far, a POC:
pbkdf2:
Code:
#!/usr/bin/env perl

# author: philsmd (for hashcat)
# date: april 2017

use strict;
use warnings;

use Crypt::PBKDF2;
use Digest::Keccak qw (keccak_256_hex);

#
# Algorithm can be found in: SecretStore::decrypt () in cpp-ethereum/libdevcrypto/SecretStore.cpp
# Examples can be found in:  cpp-ethereum/test/unittests/libdevcrypto/SecretStore.cpp
#

my $mac = "cf6bfbcc77142a22c4a908784b4a16f1023a1d0e2aff404c20158fa4f1587177"; # the "hash"

my $ciphertext = "d69313b6470ac1942f75d72ebf8818a0d484ac78478a132ee081cd954d6bd7a9";

# pbkdf2 params:

my $dklen = 32;
my $c = 262144; # iterations
my $salt = "c82ef14476014cbf438081a42709e2ed";

# pass:

# my $pass = "bar";

#
# Start
#

my $salt_bin = pack ("H*", $salt);

my $ciphertext_bin = pack ("H*", $ciphertext);

while (my $pass = <>)
{
  chomp ($pass);

  # pbkdf2:
  
  my $pbkdf2 = Crypt::PBKDF2->new
  (
    hasher     => Crypt::PBKDF2->hasher_from_algorithm ('HMACSHA2', 256),
    iterations => $c,
    output_len => $dklen
  );
  
  my $derived_key = $pbkdf2->PBKDF2 ($salt_bin, $pass);

  my $derived_key_cropped = substr ($derived_key, 16, 16);

  # SHA3 - keccak (needed for the "mac" check)
  
  my $mac_gen = keccak_256_hex ($derived_key_cropped . $ciphertext_bin);
  
  if ($mac_gen eq $mac)
  {
    print "Password found: '$pass'\n";
  }
}

how to run it:
Code:
echo bar | ./ethereum_pbkdf2.pl

scrypt:
Code:
#!/usr/bin/env perl

# author: philsmd (for hashcat)
# date: april 2017

use strict;
use warnings;

use Crypt::ScryptKDF qw (scrypt_raw);
use Digest::Keccak   qw (keccak_256_hex);

#
# Algorithm can be found in: SecretStore::decrypt () in cpp-ethereum/libdevcrypto/SecretStore.cpp
# Examples can be found in:  cpp-ethereum/test/unittests/libdevcrypto/SecretStore.cpp
#

my $mac = "2103ac29920d71da29f15d75b4a16dbe95cfd7ff8faea1056c33131d846e3097"; # the "hash"

my $ciphertext = "d172bf743a674da9cdad04534d56926ef8358534d458fffccd4e6ad2fbde479c";

# scrypt params:

my $dklen = 32;
my $n = 262144;
my $p = 8;
my $r = 1;
my $salt = "ab0c7876052600dd703518d6fc3fe8984592145b591fc8fb5c6d43190334ba19";

# pass:

# my $pass = "testpassword";

#
# Start
#

my $salt_bin = pack ("H*", $salt);

my $ciphertext_bin = pack ("H*", $ciphertext);

while (my $pass = <>)
{
  chomp ($pass);

  # scrypt:

  my $derived_key = scrypt_raw ($pass, $salt_bin, $n, $r, $p, $dklen);

  my $derived_key_cropped = substr ($derived_key, 16, 16);

  # SHA3 - keccak (needed for the "mac" check)

  my $mac_gen = keccak_256_hex ($derived_key_cropped . $ciphertext_bin);

  if ($mac_gen eq $mac)
  {
    print "Password found: '$pass'\n";
  }
}

how to run it:
Code:
echo testpassword | ./ethereum_scrypt.pl

(examples, as mentioned within the code, are from: cpp-ethereum/test/unittests/libdevcrypto/SecretStore.cpp)

Note: the code is in Perl, but it wouldn't be impossible to add GPU support in hashcat. We need to clarify a lot of things first, though. There is already a GitHub issue here: https://github.com/hashcat/hashcat/issues/262 (with very little information about the algorithm).
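
For anyone who prefers Python, here is a rough port of the pbkdf2 check above. Treat it as a sketch under assumptions, not a tested cracker: the names `keccak`, `keccak256` and `check_password` are made up for illustration, and the password is assumed to be UTF-8 encoded. One pitfall it tries to make visible: `hashlib.sha3_256` implements the final SHA-3 padding, while the mac here uses the original Keccak padding, so the two give different digests. The sketch therefore carries a small (and slow) pure-Python Keccak sponge that only differs from SHA-3 in the padding byte.

```python
import hashlib


def _rol64(a, n):
    """Rotate a 64-bit lane left by n bits."""
    return ((a >> (64 - (n % 64))) + (a << (n % 64))) % (1 << 64)


def _keccak_f1600(state):
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ ((~row[(x + 1) % 5]) & row[(x + 2) % 5])
        # iota (round constants generated by an LFSR)
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak(data, suffix, rate_bytes=136, out_len=32):
    """256-bit sponge; suffix 0x01 = original Keccak, 0x06 = NIST SHA-3."""
    state = bytearray(200)
    offset = 0
    block = 0
    while offset < len(data):
        block = min(len(data) - offset, rate_bytes)
        for i in range(block):
            state[i] ^= data[offset + i]
        offset += block
        if block == rate_bytes:
            state = _keccak_f1600(state)
            block = 0
    state[block] ^= suffix
    state[rate_bytes - 1] ^= 0x80
    state = _keccak_f1600(state)
    return bytes(state[:out_len])


def keccak256(data):
    return keccak(data, 0x01)


def check_password(password, salt_hex, ciphertext_hex, mac_hex,
                   iterations=262144, dklen=32):
    """Mirror of the Perl POC: PBKDF2-HMAC-SHA256, then Keccak-256 over key[16:32] + ciphertext."""
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                  bytes.fromhex(salt_hex), iterations, dklen)
    mac = keccak256(derived[16:32] + bytes.fromhex(ciphertext_hex))
    return mac.hex() == mac_hex.lower()
```

To try it, pass the salt, ciphertext and mac strings from the Perl script together with a candidate password. For real wallets the values have to come from your own key file, exactly as with the Perl version.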
#8
Wow !! Thank you.
Ok I'll collect info and give a better definition of the scenario as well as I can.
I'll try the perl code you wrote and try to contribute asap.

Thanks

(04-07-2017, 02:03 PM)philsmd Wrote: @dindolo1979 Attention: I would be very careful with this small set of information you provided/got here.

#9
Well...
I looked at the problem in a bit more depth, and now I think (I hope to be wrong) that I can only use a dictionary-based attack. Even ethercrack, which has the "presale" option and which @philsmd kindly suggested, cannot solve my issue.


The key points are:
  • passphrase is between 23 and 25 chars 
  • passphrase is presumably created with 3 names (i.e. Camille, Ernest, Savannah)
  • the "a" can be "@" at least in the first position
  • the "e" can be "€" at least in the first position
  • the first letter of every name is capitalized
  • at the end of the passphrase can be a "*"

So the only way I have found (please prove me wrong!) is a dictionary attack with a very big dictionary.
I can certainly pipe the dictionary I created into the .pl scripts that @philsmd wrote; I only have to test it.

If more information is needed, please tell me.
Thanks


(04-07-2017, 02:12 PM)dindolo1979 Wrote: Wow !! Thank you.
#10
Yeah, in this very specific situation it might be best to have a fast (standalone) password generator (a C file, Perl script etc. which only generates passwords according to your "rules").
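
A minimal sketch of such a generator in Python, based on the key points listed in post #9. Several things here are assumptions and not confirmed in the thread: that the three names are distinct and can appear in any order, that only the first "a" and the first "e" of each name may be swapped, and that there are no separators. The names `SUBSTITUTIONS`, `name_variants` and `candidates` are made up for this example.

```python
from itertools import permutations, product

# Assumed substitutions, taken from the key points in post #9.
SUBSTITUTIONS = {"a": "@", "e": "€"}


def name_variants(name):
    """Return the capitalized name plus variants with the first 'a'/'e' swapped."""
    variants = {name.capitalize()}
    for letter, symbol in SUBSTITUTIONS.items():
        # Iterate over a copy, because new variants are added inside the loop.
        for variant in list(variants):
            pos = variant.lower().find(letter)
            if pos != -1:
                variants.add(variant[:pos] + symbol + variant[pos + 1:])
    return sorted(variants)


def candidates(names, min_len=23, max_len=25):
    """Yield every ordered combination of 3 distinct names that fits the rules."""
    for trio in permutations(names, 3):
        for parts in product(*(name_variants(n) for n in trio)):
            for suffix in ("", "*"):
                candidate = "".join(parts) + suffix
                # Length is counted in characters, so "€" counts as one.
                if min_len <= len(candidate) <= max_len:
                    yield candidate


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python3 gen.py Camille Ernest Savannah > candidates.txt
    for candidate in candidates(sys.argv[1:]):
        print(candidate)
```

The output can be piped straight into the Perl scripts instead of being stored on disk. Keep in mind that "€" is a multi-byte character in UTF-8, so the encoding the wallet software used when the password was set matters too.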

According to your posts above, you already have generated a list of password candidates.
I'm just wondering why it is THAT huge (194 GB).

Are you sure that each password candidate within this huge dictionary file fits your rules?
Are there more than 3 words (let's say Camille, Ernest, Savannah) you want to try?
194 GB seems to be a little bit too much for "just" 3 words!
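
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes ordered picks of 3 distinct names, at most 4 variants per name (plain, "@", "€", both), 2 endings (with and without "*"), and about 25 characters plus a newline per line. All of these figures and the function names are illustrative assumptions.

```python
from math import perm


def candidate_count(num_names, words=3, variants_per_name=4, suffixes=2):
    """Upper bound on candidates: ordered picks x per-name variants x suffixes."""
    return perm(num_names, words) * variants_per_name ** words * suffixes


def wordlist_bytes(count, avg_len=25):
    """Approximate file size, one candidate per line (+1 byte for the newline)."""
    return count * (avg_len + 1)


if __name__ == "__main__":
    for n in (3, 10, 100):
        count = candidate_count(n)
        print(n, "names:", count, "candidates,", wordlist_bytes(count), "bytes")
```

Under these assumptions 3 names give 768 candidates, and even 100 base names come to roughly 3.2 GB, far below 194 GB.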

It would also make sense to run some tests first, e.g. create a new Ethereum account (with a known password, or even one with a similar password) and test the Perl script I provided.

Furthermore, these things are also very important to test first:
1. check the speed (e.g. with a small set of password candidates, and profile it) and get a feeling for how long it would take (is it feasible at all, or would it take thousands of years o.O)
2. make sure that you use all your CPU power (e.g. use something like "cat myhugefile.txt | parallel --pipe ./ethereum_pbkdf2.pl")
3. it also matters which algorithm was used to generate the account/keys: is it pbkdf2 or scrypt? This might make a huge difference, and you should probably adapt your strategy depending on this.
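
For the speed check in point 1, something like the following Python sketch gives a first rough estimate for the pbkdf2 case. It times `hashlib.pbkdf2_hmac` rather than the Perl script, so the numbers are only indicative; the function names, the dummy salt and the sample count are assumptions for illustration.

```python
import hashlib
import time


def seconds_per_candidate(iterations=262144, samples=3):
    """Average wall-clock time of one PBKDF2-HMAC-SHA256 derivation."""
    salt = bytes(16)  # dummy salt; the real one comes from the key file
    start = time.perf_counter()
    for i in range(samples):
        hashlib.pbkdf2_hmac("sha256", f"candidate{i}".encode(), salt,
                            iterations, dklen=32)
    return (time.perf_counter() - start) / samples


def worst_case_days(num_candidates, secs_each, workers=1):
    """Days needed to try every candidate, assuming perfect scaling over workers."""
    return num_candidates * secs_each / workers / 86400


if __name__ == "__main__":
    secs = seconds_per_candidate()
    print("seconds per candidate:", secs)
    # Hypothetical candidate count, for illustration only.
    print("worst case (days) for 1e6 candidates on 8 cores:",
          worst_case_days(1_000_000, secs, workers=8))
```

If the worst case comes out in years rather than days, that is a strong hint to shrink the candidate list before doing anything else.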

I suggest approaching it like this:
1. first make sure that you have either a perfectly working password generator that doesn't generate any candidates that shouldn't be tried (i.e. try to reduce the input from 194 GB to something more feasible) or a pre-generated word list that meets all the rules
2. get a feeling for how long it would take (worst case), and make sure you know whether scrypt or pbkdf2 is used as the main algorithm!
3. make sure that everything works on a test account
4. make sure that you understand what needs to be changed within the Perl script (mac, ciphertext, and the scrypt or pbkdf2 params) and that you modified everything correctly
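
To find out which algorithm is in use and which values belong in the Perl script, the key file itself can be inspected. The sketch below assumes the file is a JSON "version 3" keystore with a `crypto` (or `Crypto`) section holding `kdf`, `kdfparams`, `ciphertext` and `mac`. That layout is an assumption not shown in this thread, so check it against your own file; `describe_keystore` is a made-up helper name.

```python
import json


def describe_keystore(text):
    """Extract the fields needed for the password check from keystore JSON text."""
    data = json.loads(text)
    # Some files spell the section name with a capital C.
    crypto = data.get("crypto") or data.get("Crypto")
    if crypto is None:
        raise ValueError("no 'crypto' section found; is this a keystore file?")
    return {
        "kdf": crypto["kdf"],              # expected: "pbkdf2" or "scrypt"
        "kdfparams": crypto["kdfparams"],  # salt, dklen and c, or n/r/p
        "ciphertext": crypto["ciphertext"],
        "mac": crypto["mac"],
    }


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python3 describe.py path/to/keyfile
    with open(sys.argv[1]) as handle:
        for key, value in describe_keystore(handle.read()).items():
            print(key, "=", value)
```

The printed `mac`, `ciphertext` and `kdfparams` values are the ones to copy into the matching Perl script.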

As said, it might be possible to add this to hashcat. Scrypt is somewhat GPU-unfriendly, though, so it's not clear whether a GPU would make much difference there (compared to a very fast CPU cluster)... pbkdf2 is a different story.

Hope these thoughts help at least a little bit.

P.S. you mentioned 3 words; these specific 3 words concatenated make up a length of 21. I'm not yet sure why you say that the password is 23 to 25 characters long. Maybe the example words are not the real ones... that's not that important... but it is important how many words need to be tried etc. Maybe you can explain this a little bit more (e.g. how many base words there are, how you end up with a length of 25, and whether there are any separators between the words).