Thread: Sorting utf-8 wordlists (hashcat Forum, https://hashcat.net/forum, Misc, General Talk, /thread-1278.html)
Sorting utf-8 wordlists - fizikalac - 06-11-2012

Hi! On my Ubuntu VPS, the locale is set to en_US.utf8, but when I run the sort command on a UTF-8 wordlist in a custom language, all special characters like Ä get converted to c. It looks like a collation issue. What settings do I have to apply for this to work? Do I have to install a different locale and switch to it? That would be really bad. I tried to find a solution on Google, but without success. Thanks!

RE: Sorting utf-8 wordlists - undeath - 06-12-2012

What does the sort command you run look like?

RE: Sorting utf-8 wordlists - fizikalac - 06-12-2012

(06-12-2012, 01:16 AM)undeath Wrote: What does the sort command you run look like?

It is the standard Unix sort. I run it like this:

cat wordlist.txt | sort -u > sorted.txt

RE: Sorting utf-8 wordlists - undeath - 06-12-2012

Cannot confirm.

Code:
[ undeath@p4home: /tmp ] % ~> cat test

RE: Sorting utf-8 wordlists - fizikalac - 06-13-2012

Strange, I guess it's all about the locale... I will post again if I run into this problem again.

RE: Sorting utf-8 wordlists - NeonFlash - 11-15-2012

Did you find a solution to this? Can you extract 10 example lines from your wordlist (ones that contain accents, umlauts, and other UTF-8 characters), run the same commands undeath ran, and post the output here? Then we can test the same input on our *nix systems.

RE: Sorting utf-8 wordlists - epixoip - 11-15-2012

Please do not revive dead threads.

RE: Sorting utf-8 wordlists - NeonFlash - 11-15-2012

I just wanted to know the solution and have some discussion around it. Point noted, thank you.
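
The thread ends without a confirmed fix. One thing worth trying when a locale is suspected is to take collation out of the picture by running sort in the C locale, where lines are compared byte by byte. The snippet below is only a minimal sketch under assumptions: the sample words and the temporary directory are made up for illustration (they are not from the original poster's wordlist), and it assumes a POSIX shell with the standard sort, mktemp, and printf utilities.

```shell
# Hypothetical sample wordlist; these words are illustrative only.
workdir=$(mktemp -d)
printf 'čaj\ncaj\nćup\ncup\ncaj\n' > "$workdir/wordlist.txt"

# Print the locale variables that sort would otherwise pick up.
echo "LANG=${LANG:-} LC_ALL=${LC_ALL:-} LC_COLLATE=${LC_COLLATE:-}"

# LC_ALL=C applies to this one command only: lines are compared as raw
# bytes, so words that differ in any byte are kept as separate entries
# and only exact duplicates are removed by -u.
LC_ALL=C sort -u "$workdir/wordlist.txt" > "$workdir/sorted.txt"

cat "$workdir/sorted.txt"
```

Applied to the command from the thread, this would be `LC_ALL=C sort -u wordlist.txt > sorted.txt` (the `cat` is not needed). The resulting order is by byte value rather than by the language's alphabet, which is usually acceptable for a wordlist. Whether this addresses the original poster's problem is unknown, since the thread never shows the failing input.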