08-21-2023, 11:29 AM
it all has to do with storage and how many combinations fit into it
it all started with ascii: 7 bit = 128 combinations. stored in 1 byte (8 bit) you get 256 combinations (values 0-255)
take a look
https://www.asciitable.com/
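a quick way to check those numbers yourself, as a rough python sketch. latin-1 is only used here as one example of an "extended ascii" code page (my assumption, there are many others), and fits_in_ascii is just a helper name made up for this post:

```python
# number of distinct values for a given bit width
ascii_values = 2 ** 7   # plain ascii uses 7 bits
byte_values = 2 ** 8    # a full byte has 8 bits

print(ascii_values, byte_values)

# in latin-1 every possible byte value maps to exactly one character,
# so one byte can never give you more than this many characters
latin1_chars = bytes(range(256)).decode("latin-1")
print(len(latin1_chars))


def fits_in_ascii(text):
    """Return True if every character of text exists in plain ascii."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(fits_in_ascii("A"))        # plain letter
print(fits_in_ascii("\u00e4"))   # a-umlaut, not part of plain ascii
```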
you see ascii and extended ascii together use up all 256 combinations, so 1 byte is "full": it cannot encode more than this, no other chars possible. that is why unicode and encodings like utf-8 came up. UTF-8 today uses 1-4 bytes per character to encode all the other possibilities.
https://www.utf8-chartable.de/unicode-utf8-table.pl
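that's why äöü and other chars need 2 bytes in utf-8. if you scroll down on the utf-8 chart table you will find scripts that need 3 bytes (most chinese, japanese and korean characters, for example) and rarer scripts that need 4 bytes. this way utf-8 can encode every unicode character, which covers practically every written language in the world. most emoji and a lot of other symbols sit in the 4-byte range near the end of the list.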
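if you want to see the byte lengths without scrolling through the table, here is a small python sketch. the sample characters and the utf8_length helper are just my picks for illustration, nothing official:

```python
def utf8_length(char):
    """Return how many bytes utf-8 needs for a single character."""
    return len(char.encode("utf-8"))


# sample characters written as escapes so the source file encoding does not matter
samples = [
    "A",           # latin capital letter A
    "\u00e4",      # a-umlaut
    "\u20ac",      # euro sign
    "\U0001F600",  # grinning face emoji
]

for ch in samples:
    encoded = ch.encode("utf-8")
    # code point, number of bytes, and the raw bytes in hex
    print(f"U+{ord(ch):04X}", utf8_length(ch), encoded.hex())
```

the plain letter needs 1 byte, the umlaut 2, the euro sign 3 and the emoji 4, which matches the ranges in the chart table.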