Posts: 10
Threads: 4
Joined: Apr 2020
Hello,
I'm currently trying to write some custom mangling rules that involve non-ASCII characters, specifically focusing on the substitution ('s') command. However, it does not seem to work. For example, when I try to use a rule like "saą", I get an "unsupported rule" error (version 6.2.6).
Could you please clarify whether I am formatting the rule incorrectly, or whether Unicode characters are simply not supported in this context?
Thank you.
Posts: 407
Threads: 2
Joined: Dec 2015
If those non-ASCII characters are more than 1 byte when UTF-8 encoded, then this will not work currently, as the s rule operator only accepts single-byte substitutions.
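A quick way to check whether a character falls under this limitation is to look at its UTF-8 encoding. This is only a minimal sketch: the helper names `utf8_hex` and `fits_single_byte` are made up for illustration and are not part of hashcat.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


def fits_single_byte(ch: str) -> bool:
    """True if ch is exactly one byte when UTF-8 encoded."""
    return len(ch.encode("utf-8")) == 1


# "a" is one byte, "\u0105" (a with ogonek) is two bytes.
for ch in ("a", "\u0105"):
    print(repr(ch), utf8_hex(ch), fits_single_byte(ch))
```

Here 'a' encodes to the single byte 61, while 'ą' encodes to the two bytes c4 85, which is why a rule like "saą" is rejected.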
Posts: 10
Threads: 4
Joined: Apr 2020
(01-01-2024, 09:50 PM)Chick3nman Wrote: If those non-ASCII characters are more than 1 byte when UTF-8 encoded, then this will not work currently, as the s rule operator only accepts single-byte substitutions.
I see, thanks for the prompt response.
I have only limited experience with Hashcat's source code. If one wanted to enable byte-to-bytes substitutions, how much work do you think it would require?
Please, use this scale:
That's easy.
It would take a week.
Impossible, you're going to break everything.
Thanks.
Posts: 167
Threads: 2
Joined: Apr 2021
01-02-2024, 11:35 AM
(This post was last modified: 01-02-2024, 11:35 AM by penguinkeeper.)
This is something that has come up regularly for years. Unfortunately, it'd require quite a large refactor of the current (very complex) rule processor. As you can imagine, it has to be able to generate upwards of hundreds of billions of candidates per second, so it has to be as efficient as possible. Enabling multibyte support would slow this down by an unknown amount, so the problem isn't only dev time, it's also computational efficiency. Some rules can already be used with multibyte characters, though. For example, µ is c2 b5 in hexadecimal when UTF-8 encoded, so a rule that appends it would be "$\xc2 $\xb5". Example:
Code:
$ echo "" | ./hashcat.exe --stdout -j '$\xc2 $\xb5'
µ
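To build such an append rule for an arbitrary string, something like the following could work. It is a rough sketch that assumes the `$\xNN` form shown above is accepted for every byte; `append_rule` is a hypothetical helper name, not a hashcat function.

```python
def append_rule(text: str) -> str:
    """Build one '$\\xNN' append operation per UTF-8 byte of text."""
    return " ".join(f"$\\x{b:02x}" for b in text.encode("utf-8"))


# "\u00b5" is the micro sign, two bytes (c2 b5) in UTF-8.
print(append_rule("\u00b5"))
```

For µ this prints the same rule as in the example above. A plain ASCII character simply yields a single operation.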
Posts: 10
Threads: 4
Joined: Apr 2020
Got it, thanks.
There is no similar trick for 's', right? At least for "s 1byte 2bytes". I guess we could do it only for "s nbytes nbytes".
Posts: 890
Threads: 15
Joined: Sep 2017
Yes, there is no trick for substitution.
I think the only workaround would be to do the substitution yourself beforehand, by running a simple script (Python or whatever) over your input dictionary.
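A rough sketch of such a pre-processing script is below. It assumes one candidate per line and works on raw bytes, so lines that are not valid UTF-8 pass through untouched. The function name `substitute_bytes` and the command-line layout are made up for illustration.

```python
import sys


def substitute_bytes(word: bytes, old: str, new: str) -> bytes:
    """Replace every occurrence of old with new, compared as UTF-8 bytes."""
    return word.replace(old.encode("utf-8"), new.encode("utf-8"))


def main(old: str, new: str) -> None:
    # Read the wordlist from stdin and write substituted candidates to stdout.
    for line in sys.stdin.buffer:
        word = line.rstrip(b"\r\n")
        sys.stdout.buffer.write(substitute_bytes(word, old, new) + b"\n")


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
    else:
        print("usage: subst.py OLD NEW < wordlist > output", file=sys.stderr)
```

Saved as, say, subst.py (a hypothetical name), it could be run as `python3 subst.py a ą < wordlist.txt > wordlist_sub.txt`, and the output file then used as the wordlist. Like 's', it replaces every occurrence, but old and new may be any number of bytes.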
Posts: 10
Threads: 4
Joined: Apr 2020
01-02-2024, 04:50 PM
(01-02-2024, 04:33 PM)Snoopy Wrote: Yes, there is no trick for substitution.
I think the only workaround would be to do the substitution yourself beforehand, by running a simple script (Python or whatever) over your input dictionary.
Ack. Yes, that seems the only viable option at this point.
However, briefly returning to the option of working on the source code: do you think that adding a new command (i.e., a substitution that works on arbitrary chars) rather than modifying the existing 's' would be any easier? That is, ignoring the possible slowdown. (I would call it 'ș'.)
Posts: 167
Threads: 2
Joined: Apr 2021
It still goes against a lot of the core of the rule processor, and every one of the rules would have to be refactored and re-tested, which is certainly not an easy feat. Possible, but not easy or fast. As previously mentioned, you can use a custom script or a pre-made tool like RuleProcessorY (https://github.com/0xVavaldi/ruleprocessorY), which is an external tool that supports multibyte rules already.
A couple of related bits of source code:
https://github.com/hashcat/hashcat/blob/.../inc_rp.cl
https://github.com/hashcat/hashcat/blob/...timized.cl
Posts: 10
Threads: 4
Joined: Apr 2020
(01-02-2024, 05:25 PM)penguinkeeper Wrote: It still goes against a lot of the core of the rule processor, and every one of the rules would have to be refactored and re-tested, which is certainly not an easy feat. Possible, but not easy or fast. As previously mentioned, you can use a custom script or a pre-made tool like RuleProcessorY (https://github.com/0xVavaldi/ruleprocessorY), which is an external tool that supports multibyte rules already.
Couple of related bits of source code:
https://github.com/hashcat/hashcat/blob/.../inc_rp.cl
https://github.com/hashcat/hashcat/blob/...timized.cl
I see, thanks.
In the end, I went with pre-processing the wordlist.
Thanks again!