Copy and reuse dictionary cache - Printable Version

+- hashcat Forum (https://hashcat.net/forum)
+-- Forum: Support (https://hashcat.net/forum/forum-3.html)
+--- Forum: hashcat (https://hashcat.net/forum/forum-45.html)
+--- Thread: Copy and reuse dictionary cache (/thread-9374.html)
Copy and reuse dictionary cache - fa1rid - 07-10-2020

Hi,

I have a dictionary of size 100GB. Every time I run hashcat, I need to wait at least 15 minutes for it to build the cache of this dictionary. How can I copy this cache from machine to machine to avoid building the cache every time? What is the path of this cache?

Kind Regards.


RE: Copy and reuse dictionary cache - royce - 07-10-2020

That's ... a big wordlist.

This isn't a direct answer to your question, but you might consider:

- Splitting your dictionary into multiple chunks, using the `split` command on Unix-likes
- If the wordlist is a mashup from multiple sources, running them individually


RE: Copy and reuse dictionary cache - fa1rid - 07-10-2020

There are even bigger wordlists out there. This is my own optimized mashup. I already did that: I split it into 4 parts, but it's still slow. You completely ignored my question: where is this cache stored? Or maybe it's not stored and it's just in memory? There must be a solution for this real problem.


RE: Copy and reuse dictionary cache - royce - 07-10-2020

My having explicitly said "This isn't a direct answer to your question" isn't exactly "completely ignoring" your question, yes?

The canonical solution to this problem is to not do what you're doing. Just because there are lists bigger than 100GB doesn't mean that it's a good practice. This may not be the advice you're looking for, but it may be the advice you need.

Mashing up all of your lists into a single list is rarely necessary, and has no inherent efficiency gain. Multiple lists, or an entire directory name, can be specified on the hashcat command line.

If the purpose of your 100GB wordlist was deduplication, it is not necessary to do this via a single massive wordlist (and it is less efficient than the alternatives, such as using rli from the hashcat-utils suite to deduplicate across multiple wordlists).

If the purpose of your 100GB wordlist is to optimize attack order, simply split the file into smaller chunks and supply them to hashcat in order on the command line. The end result will be identical, but the dictionary-cache building cost will be distributed across the number of chunks. If the wait time is still larger than desired, increase the number of chunks.

But if you wish to persist in mashing up your wordlists, I'm not aware of a way to automatically distribute dictionary caches across installations. You could experiment with copying the file yourself, but I'm not sure how effective that will be. On Linux, the dictstat2 file is in ~/.hashcat/. Wherever the default Windows hashcat directory is, that's where it will be on Windows.


RE: Copy and reuse dictionary cache - fa1rid - 07-10-2020

Thanks for replying and answering my question. I appreciate your help.

Yes, my purpose is deduplication and then splitting, but even with 4 splits it's still slow, so I need to make more chunks as you said.

I wasn't aware of the rli tool; I used to use the sort command with the -u -m parameters to remove duplicates. So rli will be more efficient because I don't need to rewrite the whole file as the sort command does, right?

Regarding the cache file, dictstat2: if I use the same machine (same hardware always), will it be fine? I think you mean that you are worried about the caching mechanism, that maybe it's affected by hardware type?

Thanks again!


RE: Copy and reuse dictionary cache - fa1rid - 07-10-2020

I forgot to tell you that I use sort --parallel= in order to utilize all CPU cores and make the operation faster. Is rli multi-threaded?
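
A minimal sketch of the chunked approach royce describes above, assuming a Unix-like shell with GNU coreutils split and hypothetical file names and hash settings (mashup.txt, hashes.txt, placeholder -m/-a values):

Code:
# hypothetical names throughout; -m 0 / -a 0 are just placeholders
mkdir chunks
split -n l/8 -d mashup.txt chunks/chunk_    # 8 line-aligned pieces, numeric suffixes

# pass the chunks to hashcat in the desired order ...
hashcat -m 0 -a 0 hashes.txt chunks/chunk_00 chunks/chunk_01 chunks/chunk_02

# ... or pass the whole directory and let hashcat work through it
hashcat -m 0 -a 0 hashes.txt chunks/

Each chunk is cached separately the first time hashcat sees it, so the one-time dictionary-cache cost is spread over smaller files instead of being paid in a single long wait.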
RE: Copy and reuse dictionary cache - royce - 07-10-2020

rli is for deduplication *across files* - see this example: https://hashcat.net/wiki/doku.php?id=hashcat_utils#rli

If you use 'split', you don't have to re-sort. Just use 'split' to take your existing 4 files and split each of them into 2 or 3 pieces.

I'm not sure if dictionary stats change based on hardware. But I do know (just from experience) that they have to be updated depending on the attack type. So the stats for the same dictionary may need to be updated more than once if the attack changes.


RE: Copy and reuse dictionary cache - fa1rid - 08-01-2020

Thanks royce! Which is faster, rli or rli2?


RE: Copy and reuse dictionary cache - royce - 08-01-2020

rli2 is definitely faster - once you've paid the initial cost of sorting the input files first. But it only takes one file to be removed as input.

There's also a new project, 'rling', in progress (https://github.com/Cynosureprime/rling) that has some knobs to customize trade-offs. It's already useful, but still actively being debugged and modified, so YMMV, and support for it is probably off-topic here.


RE: Copy and reuse dictionary cache - fa1rid - 08-01-2020

It's unfortunate that rli2 doesn't support multiple files. A script needs to be made then.

Regarding the topic of this thread, I found that there's an option:

Code:
--markov-hcstat2 | File | Specify hcstat2 file to use | --markov-hcstat2=my.hcstat2

I don't understand what this option does and whether it's related to caching the dictionary.

------------------------------------------

Also, one question regarding split, please. I use the following command to split without breaking lines, but it still breaks lines and the files are not even equal or close in size.

Code:
# split file into 3 chunks

I'm using Debian 10 and the docs say:

Code:
l/N     split into N files without splitting lines/records

Code:
Usage: split [OPTION]... [FILE [PREFIX]]
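
The `l/N` form quoted from the Debian docs is selected with split's `-n` option; a minimal sketch with a hypothetical file name (the exact command used above isn't shown in the thread):

Code:
# split into 3 pieces without splitting lines (GNU coreutils split);
# piece sizes are only roughly equal, because whole lines are kept together
split -n l/3 -d wordlist.txt part_
# produces part_00, part_01, part_02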
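
For the rli/rli2 discussion earlier in the thread, a minimal sketch of typical invocations, based on the hashcat-utils wiki page linked above; the file names are hypothetical and the exact argument order and binary names should be checked against that page:

Code:
# rli: remove from wordlist1.txt every line that also appears in the other files,
# writing the result to cleaned.txt (memory-bound, no pre-sorting required)
rli wordlist1.txt cleaned.txt wordlist2.txt wordlist3.txt

# rli2: inputs must be sorted and uniqued first, only one remove file is accepted,
# and the result is assumed here to go to stdout
sort -u --parallel=4 -o big.sorted big.txt
sort -u --parallel=4 -o remove.sorted remove.txt
rli2 big.sorted remove.sorted > cleaned.txt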