07-14-2023, 02:46 PM
First, don't open new threads for follow-up questions that come up after receiving an answer in another thread; use the old one.
Second, one possible answer is already given by sort's own message:
Code:
Set LC_ALL='C' to work around the problem.
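A minimal sketch of what that hint looks like in practice, assuming the data lives in a file called input.txt (a placeholder name; the sample contents are made up so the snippet runs on its own). Setting the variable in front of the command changes the locale for that one sort call only:

```shell
# input.txt is a placeholder; create a small sample with one non-ASCII byte (octal 351)
printf 'b\351ta\nalpha\nGamma\n' > input.txt

# LC_ALL=C applies to this single command only: sort compares raw bytes
# and does not try to decode multibyte characters
LC_ALL=C sort input.txt > sorted.txt
```

Note that byte order puts uppercase letters before lowercase ones, so the result may differ from what your usual locale would produce.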
Nevertheless, there is another great Linux tool called iconv:
Code:
iconv -c input.txt > output.txt
or
iconv -c -t UTF-8 input.txt > output.txt
The -c option makes iconv drop characters it cannot convert, so the offending bytes are stripped from the input. Nevertheless, your input files seem to be malformed or to have been through some serious misconversion between different character encodings, which is the usual cause of the problems you mentioned.
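To check whether a file really is malformed before stripping anything, something like the following might help. It is only a sketch and assumes the file is supposed to be UTF-8 (the thread does not say which encoding it should be); input.txt and its sample contents are placeholders. Without -c, iconv stops at the first invalid sequence and exits with a non-zero status:

```shell
# Sample file with one byte (octal 351) that is not valid UTF-8 on its own
printf 'ok line\nb\351ta\n' > input.txt

# Convert UTF-8 to UTF-8 and throw the output away; only the exit status matters
if iconv -f UTF-8 -t UTF-8 input.txt > /dev/null 2>&1; then
    result="valid UTF-8"
else
    result="invalid UTF-8"
fi
echo "input.txt: $result"
```

If the check fails, it is worth finding out which encoding the file originally had and passing it with -f, instead of just dropping bytes with -c.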