hcstatgen not making 32.1MB files
yeah, that's it. what you sent me in pm was \x95\x66\x1a so it looks like you were just missing the fourth byte there. 0x0a is the newline character.
i wonder if it's actually supposed to be \u9566\u1A0D

edit: yes, i do believe that's what it is: two chinese or thai or some other asian characters encoded as utf16le.

printf '\x95\x66\x1A\x0D' | iconv -f utf16le
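for anyone without iconv handy, here is a rough python sketch of the same check. it assumes the four bytes are exactly the ones in the printf command above (the fourth byte, 0x0D, is taken from that command) and the helper name describe is made up for illustration. it decodes the bytes both ways, because byte order matters here: read as little-endian the code points are U+6695 and U+0D1A, while read as big-endian they are U+9566 and U+1A0D.

```python
import unicodedata

# The four bytes discussed in this thread. The fourth byte (0x0D) is an
# assumption taken from the printf command above.
raw = b"\x95\x66\x1a\x0d"


def describe(data, encoding):
    """Decode data with the given encoding and return (code point, name) pairs."""
    text = data.decode(encoding)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Same bytes, two possible byte orders.
little = describe(raw, "utf-16-le")
big = describe(raw, "utf-16-be")

for label, result in (("utf-16-le", little), ("utf-16-be", big)):
    print(label, result)
```

the names printed come from python's bundled unicode database, so they are a quick way to see which script each code point belongs to under each byte order.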
I hope you are not asking me, because I have no idea about anything beyond my first post on this thread! :D
naw, just trying to get all the information necessary to reproduce.
(01-30-2013, 02:28 AM)epixoip Wrote: naw, just trying to get all the information necessary to reproduce.

If you want to reproduce it better I would use the latest beta as it tells you which line that character is on. The current hcstatgen.exe 0.9 just finishes quickly.
Please add a ticket for this
(01-30-2013, 12:01 PM)atom Wrote: Please add a ticket for this

Done, with link back to this thread.

epixoip, have you been able to reproduce it?
(01-30-2013, 02:20 AM)epixoip Wrote: 暕ച

that does not look like it is meant to be utf16. While the first character is chinese the second is not. Looks more like arabic.
it may not be utf16, but the second character is definitely not arabic. unicode table states that it's a buginese character. the first character is listed as C/J/K (chinese, japanese, korean) since a lot of asian languages use chinese characters for words, even if they do not speak chinese. the bugis are very closely related to the chinese, so it's not implausible that some bugis use chinese characters as well.
note that none of this is strictly relevant to reproducing the bug since we have the byte stream :)