UTF-8 - what is the 0 after the ones in the leading byte for?
If a character is encoded as a single byte, the most significant bit is set to 0 for ASCII compatibility. If a character is encoded in 2-4 bytes, the 2-4 high bits of the leading byte are set to 1 and are followed by a 0. What is that 0 for, if in theory the ones alone are enough to determine the boundaries of the character?
What is the 0 for, if in theory the ones alone are enough to define the boundaries of a character?
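Take 11110001 10xxxxxx and assume there is no terminating 0.
Is that two bytes encoding the character 110001xxxxxx?
Or three bytes encoding the character 10001xxxxxx...?
Or four, encoding the character 0001xxxxxx...?
The ones alone cannot tell you where the length prefix ends and the character's own bits begin. The 0 marks that boundary.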
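To make the ambiguity concrete, here is a minimal Python sketch. The function names `sequence_length` and `readings_without_terminator` are made up for illustration, and the second one models a hypothetical encoding without the 0 terminator, not real UTF-8. The first function only inspects the bit pattern of the leading byte; it does not reject bytes such as 0xC0 that a strict decoder would also refuse.

```python
def sequence_length(lead: int) -> int:
    """Length in bytes of the UTF-8 sequence that starts with `lead`."""
    ones = 0
    mask = 0x80
    # Count leading 1 bits; the first 0 bit ends the length prefix.
    while mask and lead & mask:
        ones += 1
        mask >>= 1
    if ones == 0:
        return 1  # 0xxxxxxx: single-byte ASCII
    if ones == 1 or ones > 4:
        # 10xxxxxx is a continuation byte; 5+ ones is not used by UTF-8.
        raise ValueError("not a valid UTF-8 leading byte")
    return ones


def readings_without_terminator(lead: int) -> dict:
    """Hypothetical scheme with no 0 terminator: every way to split the
    run of leading ones into 'length prefix' and 'payload' is plausible."""
    bits = format(lead, "08b")
    run = len(bits) - len(bits.lstrip("1"))  # number of leading ones
    return {n: bits[n:] for n in range(2, min(run, 4) + 1)}


print(sequence_length(0b11110001))              # 4
print(readings_without_terminator(0b11110001))  # {2: '110001', 3: '10001', 4: '0001'}
```

With the terminating 0 the length has exactly one reading, because counting stops at the first 0 bit. Without it, the same byte yields the three candidate payloads listed in the answer.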