Character encoding
OldJohn, 2022-03-18 10:52:09

Java: converting text from windows-1251 to UTF-8?

Why do English letters survive being encoded as windows-1251 and then decoded as UTF-8, while Russian characters do not?
After all, if both encodings are based on Unicode, then Latin and Cyrillic characters should map to the same code points in both.

1 answer(s)
Michael, 2022-03-18
@OldJohn

windows-1251 is not a Unicode encoding. It is a legacy single-byte encoding from the era before Unicode was in common use (like cp866, KOI8-R and others): each byte value maps to one character through its own 256-entry table.
Latin letters work because ASCII, windows-1251 and UTF-8 all agree on the range 0-127: the same character is encoded as the same single byte in each of them.
Cyrillic is different: the bytes that encode a Cyrillic character in windows-1251 and in UTF-8 are not the same. In windows-1251 it is a single byte in the range 128-255, while in UTF-8 it is a pair of bytes. For example, "А" (U+0410) is C0 in windows-1251 but D0 90 in UTF-8. To convert, decode the bytes as windows-1251 into a String, then encode that String as UTF-8.
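
To make the difference concrete, here is a minimal Java sketch. The class name EncodingDemo and the helpers cp1251ToUtf8 and hex are made up for illustration, and it assumes the JVM ships the windows-1251 charset (standard JDK builds normally do, but it is not one of the charsets every JVM is required to support). It prints the bytes of a Latin and a Cyrillic letter in both encodings, shows what goes wrong when windows-1251 bytes are read as UTF-8, and then converts them properly by going through a String.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingDemo {
    // windows-1251 has no constant in StandardCharsets, so look it up by name.
    // Assumption: the charset is available in this JVM.
    static final Charset CP1251 = Charset.forName("windows-1251");

    /** Re-encodes windows-1251 bytes as UTF-8 by decoding to a String first. */
    static byte[] cp1251ToUtf8(byte[] cp1251Bytes) {
        String text = new String(cp1251Bytes, CP1251);  // bytes -> Unicode text
        return text.getBytes(StandardCharsets.UTF_8);   // Unicode text -> UTF-8 bytes
    }

    /** Formats bytes as space-separated uppercase hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String latin = "A";          // LATIN CAPITAL LETTER A, U+0041
        String cyrillic = "\u0410";  // CYRILLIC CAPITAL LETTER A, U+0410

        // Latin: the same single byte in both encodings.
        System.out.println(hex(latin.getBytes(CP1251)));
        System.out.println(hex(latin.getBytes(StandardCharsets.UTF_8)));

        // Cyrillic: one byte in windows-1251, two bytes in UTF-8.
        byte[] cyrillicCp1251 = cyrillic.getBytes(CP1251);
        System.out.println(hex(cyrillicCp1251));
        System.out.println(hex(cyrillic.getBytes(StandardCharsets.UTF_8)));

        // Wrong: reading windows-1251 bytes as if they were UTF-8.
        String garbled = new String(cyrillicCp1251, StandardCharsets.UTF_8);
        System.out.println(garbled.equals(cyrillic));   // false

        // Right: decode with the encoding the bytes were written in.
        byte[] utf8 = cp1251ToUtf8(cyrillicCp1251);
        String restored = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(restored.equals(cyrillic));  // true
    }
}
```

The key point is that a Java String is already Unicode text, so there is no direct "bytes to bytes" conversion: you always name the source charset when decoding and the target charset when encoding.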
