Character encoding
Qubc, 2017-12-26 21:45:05

How does Notepad determine the encoding of a file?

In Far Manager, I create an empty file without an extension, in either ANSI (Windows-1251) or UTF-8 encoding.
I open it with Notepad, click "Save As", and see that Notepad already offers ANSI or UTF-8, respectively.
How does this happen?

2 answer(s)
VoidVolker, 2017-12-26
@Qubc

When Far saves a file in UTF-8, it writes a BOM marker (EF BB BF) at the beginning of the file, indicating that all the text that follows is encoded in UTF-8. When saving in ANSI, it writes nothing. A UTF-8 file may also have no BOM, in which case it is up to the editor and/or the user to determine the encoding. Some editors can analyze the bytes and determine the correct encoding with a fairly high probability.
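The check described above can be sketched roughly as follows. This is only an illustration, not Notepad's actual algorithm: the function name `guess_encoding` is made up, and falling back to Windows-1251 is an assumption based on the code page mentioned in the question.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Guess a text encoding from raw bytes (hypothetical helper)."""
    # A UTF-8 BOM (EF BB BF) at the very start is an explicit marker.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    # No BOM: try strict UTF-8 decoding. Non-ASCII text in a legacy
    # single-byte code page rarely happens to be valid UTF-8.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: fall back to the ANSI code page from the question.
        return "cp1251"
```

For example, `guess_encoding("Привет".encode("cp1251"))` falls through to the `cp1251` branch, because those bytes are not a valid UTF-8 sequence. A file with no BOM that contains only ASCII characters looks the same in both encodings, so a guess of this kind cannot tell them apart.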

Ezhyg, 2017-12-26
@Ezhyg

Maybe the file is not completely empty?
An empty file is just 0 bytes, with no streams and no information at all, not even about its encoding.
I just checked: I created an empty file, "changed" the encoding to UTF-8 with BOM (because Notepad only knows how to save UTF-8 with a BOM), saved it, and voila, the file is now 3 bytes. Guess what is in those 3 bytes?
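The same experiment can be reproduced outside Notepad. A small sketch, assuming Python's standard `utf-8-sig` codec as a stand-in for "UTF-8 with BOM" (the variable names are just for illustration):

```python
import codecs

# Encoding an empty string with "utf-8-sig" yields only the BOM.
empty_with_bom = "".encode("utf-8-sig")

# Render the bytes as space-separated uppercase hex for inspection.
bom_hex = " ".join(f"{b:02X}" for b in empty_with_bom)

print(len(empty_with_bom), bom_hex)
```

The three bytes are the same constant that the standard library exposes as `codecs.BOM_UTF8`.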
