Password generation from HDD
#1
I'm currently working on cracking a sha512crypt hash. I've got an image of the hard disk it came from, so I'm initially concentrating on generating wordlists from the disk image.

Other than simply running 'strings' across the image (which I've already done), I'd be interested to hear if anyone uses more intelligent methods of generating keyword lists from disk images.

My next plan was to try and generate a wordlist from the mounted filesystem with all compound files expanded, which I suspect will give me a more accurate wordlist, capturing words which fall across sector boundaries, inside archives etc.

In this instance, the file system of the target device is ext4, but I'd also be interested in tools for NTFS-based systems, or tools for generating keyword lists from specific system areas such as the pagefile/hiberfil (e.g. using tools such as Volatility to structure the data prior to extraction).

Any comments or advice would be appreciated.
Cheers
D
#2
I never thought about creating a wordlist this way. In theory you should be able to do "strings /dev/sda", that should work and if there's hidden data you should get it with this as well...
#3
(02-13-2014, 07:58 PM)atom Wrote: I never thought about creating a wordlist this way. In theory you should be able to do "strings /dev/sda", that should work and if there's hidden data you should get it with this as well...
Running strings against /dev/sda reads the disk sectors sequentially from start to finish. That gives a good wordlist and includes words from the deleted areas of the disk (unallocated space), but for fragmented files a very small number of words will straddle the point where one fragment ends, so they are not contiguous on disk and will be missed.

I guess you could mount the partition and extract strings from the live files this way, which will resolve the fragmentation issue:
find . -xdev -type f -print0 | xargs -0 strings | sort -u

It's also worth remembering that by default strings will only pull out ASCII strings, so some tweaking will be needed to get Unicode, base64 and other encoded strings (GNU strings has an -e option for other encodings, e.g. -e l for 16-bit little-endian).
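
A minimal Python sketch of that tweaking, covering only plain ASCII and ASCII-range text stored as UTF-16LE (the common Windows case). The names extract_strings and MIN_LEN are made up for illustration, and base64 or other encodings are not handled:

```python
import re

MIN_LEN = 4  # assumed minimum run length, matching the usual strings default

# Runs of printable ASCII bytes (space through tilde).
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
# Runs of printable ASCII characters stored as UTF-16LE,
# i.e. each character byte is followed by a NUL byte.
UTF16LE_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)


def extract_strings(data: bytes) -> set:
    """Return printable ASCII and UTF-16LE strings found in a byte buffer."""
    found = {m.group().decode("ascii") for m in ASCII_RE.finditer(data)}
    found |= {m.group().decode("utf-16-le") for m in UTF16LE_RE.finditer(data)}
    return found


if __name__ == "__main__":
    sample = b"\x00\x01hunter2\xff" + "S3cret!".encode("utf-16-le")
    for word in sorted(extract_strings(sample)):
        print(word)
```

For a real image the buffer would need to be read in chunks with some overlap, otherwise strings that straddle a chunk boundary are lost in the same way as with fragmented files.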

Even then, this still leaves the issue of strings inside compound files such as zips (including docx files), email archives (pst, dbx, etc.) and other proprietary data formats, which will be missed.
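
For the zip-based formats at least (zip, docx and similar), the expansion can be sketched with Python's standard zipfile module. This is only an illustration under assumptions: iter_payloads, words_from_file_bytes and the depth limit are hypothetical names and choices, encrypted members will raise an error on read, and pst/dbx or other proprietary formats would need their own parsers:

```python
import io
import re
import zipfile

# Runs of at least four non-space printable ASCII bytes.
TOKEN_RE = re.compile(rb"[\x21-\x7e]{4,}")


def iter_payloads(data: bytes, depth: int = 0, max_depth: int = 3):
    """Yield the raw bytes, then the bytes of every member if it is a zip."""
    yield data
    if depth >= max_depth or not zipfile.is_zipfile(io.BytesIO(data)):
        return
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile:
        return  # looked like a zip but could not be parsed
    with archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Recurse so that archives nested inside archives are expanded too.
            yield from iter_payloads(archive.read(info), depth + 1, max_depth)


def words_from_file_bytes(data: bytes) -> set:
    """Collect candidate words from a file and from anything zipped inside it."""
    words = set()
    for payload in iter_payloads(data):
        words.update(m.group().decode("ascii") for m in TOKEN_RE.finditer(payload))
    return words
```

Each file found by the find command above could be read and passed to words_from_file_bytes, so that compressed text inside a docx or zip ends up in the list instead of only the compressed bytes.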

I'm going to have a look at some of the open source forensic tools to see how they approach this problem from a keyword searching perspective.

Cheers
d
#4
this seems like a really poor approach to crack sha512crypt. i would be hesitant to even use this approach with fast hashes. you will end up with all sorts of garbage and bullshit in your "wordlist" at the end of this process.