ASCII
sddvxd, 2019-05-20 13:35:14

Why do ASCII and UTF-8 texts have different lengths?

Good afternoon!
I have a binary file in an ANSI encoding. I copy its contents into a file with UTF-8 encoding. The first file is 7,168 bytes, and the second (the one I copied the text into) is 7,736 bytes. The same characters in the same order somehow produce a different size. Please explain why.


2 answer(s)
SagePtr, 2019-05-20
@SagePtr

Binary? Re-encoded? Don't expect it to stay binary after that.

Vladimir Dubrovin, 2019-05-20
@z3apa3a

In UTF-8, characters that belong to ASCII are encoded in one octet, but all other characters take more (currently 2 to 4 octets). For example, Cyrillic letters and the accented letters specific to European languages are encoded in 2 octets.
In single-byte ANSI code pages (for example Windows-1251 or Windows-1252), every character is encoded as exactly one octet, and such a code page contains more than just the ASCII characters. So every non-ASCII character in the original file grows when it is converted to UTF-8.
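To make the difference concrete, here is a minimal Python sketch. It assumes the "ANSI" code page is Windows-1251 (`cp1251`), which the question does not actually state; the sample string, the sample bytes, and the helper name `encoded_sizes` are made up for illustration.

```python
# Assumption: the "ANSI" encoding is Windows-1251 (cp1251).
# The question does not say which code page was really used.
ANSI_CODEC = "cp1251"


def encoded_sizes(text, ansi_codec=ANSI_CODEC):
    """Return (bytes in the single-byte code page, bytes in UTF-8)."""
    return len(text.encode(ansi_codec)), len(text.encode("utf-8"))


# Hypothetical sample: 6 Cyrillic letters plus 8 ASCII characters.
sample = "Привет, world!"
ansi_size, utf8_size = encoded_sizes(sample)
print(ansi_size, utf8_size)  # 14 20

# The same effect on raw bytes: 'A' stays one octet,
# while the two bytes >= 0x80 each become two octets in UTF-8.
raw = bytes([0x41, 0xC0, 0xFF])
recoded = raw.decode(ANSI_CODEC).encode("utf-8")
print(len(raw), len(recoded))  # 3 5
```

Applied to the numbers in the question: 7,736 - 7,168 = 568 extra bytes. If every non-ASCII byte in the file turned into a 2-octet sequence, that would mean about 568 such bytes, but this is only an estimate, since some code-page characters (certain punctuation marks, for instance) need 3 octets in UTF-8.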
