Idea: text-processor for pw-candidates
Hi folks
I'm kind of a newbie, but I managed to recover some of my lost passwords for some zip files with hashcat and the PRINCE processor, which was cool.
I have a basic understanding of coding but too little experience (and actually no time) to code this idea myself, so I just want to share it with you.
Probably someone has already done it, but I couldn't find it on the forum or in the hashcat repos.

So here it goes:
Tool to derive password candidates/dictionary from input-texts.
A common "algorithm" for obtaining random-looking passwords that are easy for humans to remember is to take the first letter(s) of each word in a memorable sentence, poem, song text or whatever "chain of words".

As an example, let's take the following part of a song text:
"Master of puppets I'm pulling your strings
Twisting your mind and smashing your dreams"

A possible password candidate would then be "MopIpysTymasyd", to which all the hashcat rules for upper/lower case and other substitutions could be applied as well.
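The core abbreviation step is tiny. A minimal sketch (function and variable names are my own, nothing official):

```python
def abbreviate(text, n_chars=1):
    """Take the first n_chars of every whitespace-separated word."""
    return "".join(word[:n_chars] for word in text.split())

lyric = ("Master of puppets I'm pulling your strings "
         "Twisting your mind and smashing your dreams")
print(abbreviate(lyric))      # -> MopIpysTymasyd
print(abbreviate(lyric, 2))   # -> MaofpuI'puyostTwyomiansmyodr
```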

So I'm thinking of a tool with the following functionality:
  1. take a file which contains chains of words (aka sentences; but it should not rely on punctuation or line breaks, as in the example above)
  2. put each word into an <array> or <list>; keep the order of the words and don't strip multiple occurrences of the same word, as that changes the logic of the sentence
    -- maybe put the punctuation marks into the array as single-letter words too
  3. set the number of chars to be taken from each word (n-char=#)
    -- e.g. above it was 1 (the first), but it could also be set to 2 (always take the first two) etc.
    -> 2: "MaofpuI'puyostTwyomiansmyodr"
    -- if a word contains fewer chars than the number set, the whole word is taken (without triggering an error) and the tool just proceeds with the next word in the list
    -- one could also think of a syntax n-char=1,2 > take 1 letter starting from position 2 of each word (default would be n-char=1,1);
    --- n=1,2: "afu'uotwoinmor"
    -- define how contractions should be interpreted,
    e.g. I'm >> 1 word "I'm" and/or 2 words "I am" etc. (don't >> "don't" and "do not")
  4. set the length of the password candidates, from "n" to "m": (min-pw-length=#) / (max-pw-length=#)
    -- maybe a length of 1 makes no sense, as it just feeds the charset once, so start min-pw-length at 2 by default?
    -- if max-pw-length is not given, it iterates until <max-pw-length == number of words in the infile>
  5. set the min and max chain length to take (the "moving-window width"): (-min-chain-length=#) / (-max-chain-length=#)
    -- set the moving window to a certain width, i.e. process only a certain number of consecutive words
    -- if not set, start with each word on its own, then always take 2 words, then 3 etc., until the whole chain is taken in full
  6. maybe also an option to invert the direction of processing (start with the last word, then the second to last etc.): -invert-chain=0/1/2 (default=0=no; 1=yes; 2=both directions)
    -- e.g.: sypIpoM (first line, one char per word, inverted word order)
  7. also invert the direction of the chars (take the n chars from the end of each word): (-invert-char=0/1/2)
    -- e.g. rfsmgrs (first line, one char per word, inverted char direction)
    -- e.g. srgmsfr (first line, one char per word, both directions inverted)
  8. option to store the candidates in an outfile
  9. feed the candidates to hashcat (stdout)
  10. let hashcat / princeprocessor / maskprocessor etc. apply further rules to the candidates, for changing letter case, substituting letters etc.
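Pulling steps 1-7 together, here is a rough sketch of how the candidate generator could work. All names, defaults and the exact handling of -invert-char (chars taken from the reversed word) are my own guesses for illustration, not an existing tool:

```python
def candidates(text, n_chars=1, start=1,
               min_chain=1, max_chain=None,
               min_pw_len=2, max_pw_len=None,
               invert_chain=False, invert_char=False):
    """Yield password candidates from a chain of words (steps 2-7 of the idea)."""
    words = text.split()                     # step 2: keep order and duplicates
    if invert_chain:                         # step 6: reverse the word order
        words = words[::-1]
    # step 3: take n_chars from each word, beginning at position `start`;
    # a word shorter than the requested span is taken as-is (slicing never errors)
    if invert_char:                          # step 7: take chars from the end
        parts = [w[::-1][start - 1:start - 1 + n_chars] for w in words]
    else:
        parts = [w[start - 1:start - 1 + n_chars] for w in words]
    max_chain = max_chain or len(words)      # step 5 default: up to the full chain
    for chain in range(min_chain, max_chain + 1):    # moving-window width
        for i in range(len(words) - chain + 1):
            pw = "".join(parts[i:i + chain])
            # step 4: length filter
            if len(pw) >= min_pw_len and (max_pw_len is None or len(pw) <= max_pw_len):
                yield pw

line = "Master of puppets I'm pulling your strings"
print(list(candidates(line, min_chain=7, max_chain=7)))   # -> ['MopIpys']
```

Steps 8-9 would then just be writing the yielded candidates line by line to an outfile or to stdout, so they can be piped straight into hashcat.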

E.g. a program call could then look like:
  • text-processor -i <infile> -o <outfile> -nchar=2,1 -max-pw-length=16 -max-chain-length=7

-- as two chars from each word and at most 7 words are processed, -max-pw-length=16 will never be reached, since the maximum pw length would be 2*7=14
-- with -max-chain-length=9, -max-pw-length=16 would limit the output instead and make the chain length redundant
--- so I guess only one max parameter should be given at the program call, or whichever limit is reached first dictates the other...
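The interplay of the two limits could be checked up front; whichever is reached first wins. A tiny sketch (the helper name is made up for illustration):

```python
def effective_max_len(n_chars, max_chain, max_pw_len=None):
    """Effective upper bound on candidate length: the stricter of the two limits."""
    window_max = n_chars * max_chain          # longest string the widest window can produce
    return window_max if max_pw_len is None else min(window_max, max_pw_len)

print(effective_max_len(2, 7, 16))   # -> 14: the chain limit dominates
print(effective_max_len(2, 9, 16))   # -> 16: the pw-length limit dominates
```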

and produce the following candidates, if the example above is in the <infile>:

Quote:# chain=1, char=2,1
Ma of pu I' pu yo st Tw yo mi an sm yo dr

# chain=2, char=2,1
Maof ofpu puI' I'pu puyo yost stTw Twyo yomi mian ansm smyo yodr

# chain=3, char=2,1
Maofpu ofpuI' puI'pu I'puyo puyost yostTw stTwyo Twyomi yomian miansm ansmyo smyodr

# chain=7, char=2,1
MaofpuI'puyost ofpuI'puyostTw puI'puyostTwyo I'puyostTwyomi puyostTwyomian yostTwyomiansm stTwyomiansmyo Twyomiansmyodr
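The chain=2 list, for instance, can be reproduced in a couple of throwaway lines (just a sketch of the windowing idea, not an existing tool):

```python
lyric = ("Master of puppets I'm pulling your strings "
         "Twisting your mind and smashing your dreams")
parts = [w[:2] for w in lyric.split()]                              # char=2,1: first two chars of each word
chain2 = ["".join(parts[i:i + 2]) for i in range(len(parts) - 1)]   # moving window of 2 words
print(" ".join(chain2))   # -> Maof ofpu puI' I'pu puyo yost stTw Twyo yomi mian ansm smyo yodr
```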

It's somewhat similar to what could be produced with the PRINCE processor, but PRINCE is normally fed a list of unique words, so it may take a long time until the same word is chained multiple times, and of course a PRINCE chain doesn't preserve the word-order logic that a sentence understood by humans has.

I'm not sure which iteration order would be more efficient: first process all char counts for a given chain length, or vice versa.

What do you think?
Is there already such a tool out there?