encoding questions ...
#1
OK, I'm very new to encoding and need help understanding the basics.

Let's take the password "pass123" in a text file, pass.txt.

1- If the password/hash was made on a US computer, I put the word pass123 in pass.txt encoded in UTF-8, correct?
2- If the password/hash "pass123" was made on a Russian computer with Windows-1251 as the default encoding, should I put pass123 in a pass.txt encoded in Windows-1251, or do I stay with UTF-8?

I use Notepad++ on Windows.

Thanks for your time.
#2
UTF-8 uses one to four bytes to encode each character.
UTF-8 was designed for backward compatibility with ASCII:
the first 128 characters are encoded using a single byte (8 bits) with the same binary value as in ASCII.
https://en.wikipedia.org/wiki/UTF-8
https://en.wikipedia.org/wiki/ASCII

Windows-1251 uses one byte (8 bits) to encode each character.
Its first 128 characters have the same binary values as in ASCII.
https://www.ascii-code.com/CP1251
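
To make the byte counts above concrete, here is a minimal Python sketch. The sample characters and the helper names `utf8_len` and `cp1251_len` are only illustrative assumptions, not something from the linked pages.

```python
# Sample characters, written as escapes so the source itself stays ASCII:
# "a" (ASCII), Cyrillic "pe", the euro sign, and an emoji.
SAMPLES = ["a", "\u043f", "\u20ac", "\U0001F600"]


def utf8_len(ch: str) -> int:
    """Number of bytes UTF-8 needs for one character."""
    return len(ch.encode("utf-8"))


def cp1251_len(ch: str) -> int:
    """Number of bytes Windows-1251 needs for one character it can represent."""
    return len(ch.encode("cp1251"))


# UTF-8 is variable width: these four samples take 1, 2, 3 and 4 bytes.
utf8_lengths = [utf8_len(ch) for ch in SAMPLES]

# Windows-1251 is fixed width: Cyrillic "pe" fits in a single byte.
cyrillic_cp1251_len = cp1251_len("\u043f")

# For the first 128 code points, ASCII, UTF-8 and Windows-1251 agree byte for byte.
ascii_identical = all(
    chr(i).encode("ascii") == chr(i).encode("utf-8") == chr(i).encode("cp1251")
    for i in range(128)
)

print(utf8_lengths, cyrillic_cp1251_len, ascii_identical)
```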

All the characters in pass123 are ASCII.

Thus, a file containing only these characters is 7 bytes in size
regardless of whether it is saved as ASCII, UTF-8, or Windows-1251 (assuming no byte order mark and no trailing newline).
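
A quick way to check this yourself, sketched in Python. The Cyrillic word used for contrast and the temporary pass.txt are assumptions added for illustration; the point is only that the encoding choice starts to matter once non-ASCII characters appear, or when a byte order mark (BOM) is written.

```python
import os
import tempfile

password = "pass123"

# Encode the same ASCII-only password three ways and compare the raw bytes.
encoded = {name: password.encode(name) for name in ("ascii", "utf-8", "cp1251")}
same_bytes = len(set(encoded.values())) == 1

# Contrast: a Cyrillic word (escapes spell the Russian word for "password").
cyrillic = "\u043f\u0430\u0440\u043e\u043b\u044c"
cyrillic_utf8 = cyrillic.encode("utf-8")      # 2 bytes per letter
cyrillic_cp1251 = cyrillic.encode("cp1251")   # 1 byte per letter

# File sizes on disk: plain encoding vs. UTF-8 with a BOM ("utf-8-sig" in Python).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "pass.txt")
    with open(path, "w", encoding="cp1251", newline="") as f:
        f.write(password)
    size_plain = os.path.getsize(path)

    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(password)
    size_bom = os.path.getsize(path)

print(same_bytes, len(cyrillic_utf8), len(cyrillic_cp1251), size_plain, size_bom)
```

So for pass123 the bytes are identical in all three encodings; only the BOM variant makes the file larger.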
#3
(11-24-2023, 02:06 PM)v71221 Wrote: UTF-8 uses one to four bytes to encode characters. [...] Thus, a file with these characters must be 7 bytes in size regardless of whether it is saved in ASCII or UTF-8 or Windows-1251 format.

OK, thanks for the explanation. So if I understand correctly, I use UTF-8 encoding in Notepad++, even if the computer was CP-1251?
#4
The default encoding for new files in Notepad++ is UTF-8; you can change the encoding of the file to whatever you want.

Other widely used encodings include:

plain ASCII/ANSI
Windows-1252
ISO-8859-1
ISO-8859-15
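
If you would rather convert a wordlist outside the editor, the same idea can be sketched in Python. The helper name `reencode_file` is hypothetical, and the sketch assumes you already know the file's current encoding; it does not try to detect it.

```python
from pathlib import Path


def reencode_file(path: str, src_encoding: str, dst_encoding: str) -> None:
    """Rewrite a text file in place, converting it from one encoding to another.

    Raises UnicodeDecodeError if the file is not valid in src_encoding, and
    UnicodeEncodeError if a character cannot be represented in dst_encoding.
    """
    p = Path(path)
    # Decode the raw bytes to text first, then encode the text in the target encoding.
    text = p.read_bytes().decode(src_encoding)
    p.write_bytes(text.encode(dst_encoding))


# Example call (assumes pass.txt exists and is currently UTF-8):
# reencode_file("pass.txt", "utf-8", "cp1251")
```

For an ASCII-only file such as one containing just pass123, this conversion leaves the bytes unchanged.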