Unicode
Riateche, 2012-03-12 18:10:03

Mercurial corrupts the Russian letter "Р" in files

We use Mercurial for collaboration on a project. The development environment is Qt Creator (although, in my opinion, this does not matter). The files are encoded in UTF-8. From time to time it turns out that in some files all occurrences of the Russian letter “Р” (bytes D0 A0) have been replaced with the invalid byte sequence D0 20. So far I have not been able to reproduce the sequence of actions leading to the problem; it always appears suddenly. The developers use Mercurial on different operating systems (Windows XP, Windows 7, Debian). The repository on the server is accessed via ssh with key-based authentication. I suspect that the problem may lie in one of the Mercurial versions. How can this problem be dealt with?

Upd. The problem turned out to be in Qt Creator. If you run a project-wide text replacement in it, it corrupts the letter in files that were not open at the time of the replacement. I will look into why it does this.
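To find files that have already been damaged this way, a scan of the working copy for the byte pair D0 20 may help. In valid UTF-8 the lead byte D0 must be followed by a continuation byte in the range 80..BF, so D0 20 should never occur in an intact UTF-8 file. Below is a minimal sketch; the function name `find_damaged_files` and the default list of suffixes are assumptions chosen for illustration, not something from the project.

```python
from pathlib import Path

# D0 followed by an ASCII space: what is left of "Р" (D0 A0) after the damage.
BAD_PAIR = b"\xd0\x20"


def find_damaged_files(root, suffixes=(".cpp", ".h", ".ui", ".pro")):
    """Return {path: count} for files under root that contain the D0 20 pair.

    The suffix list is a hypothetical default; adjust it to the project.
    Files are read as raw bytes, so no decoding errors can occur here.
    """
    hits = {}
    for path in sorted(Path(root).rglob("*")):
        if ".hg" in path.parts:
            continue  # skip Mercurial's own metadata directory
        if path.is_file() and path.suffix in suffixes:
            count = path.read_bytes().count(BAD_PAIR)
            if count:
                hits[path] = count
    return hits
```

Calling `find_damaged_files(".")` from the repository root returns the affected files with the number of occurrences in each. Replacing D0 20 back with D0 A0 would restore this particular letter, but other characters whose UTF-8 encoding contains A0 could be damaged in the same way, so the results are worth checking by hand.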


1 answer(s)
Konstantin Vlasov, 2012-03-12
@CaptainFlint

There is not enough information: it is not clear where exactly the replacement takes place. In local files, in remote files, in the repository? If it is in the repository, look at whose commit introduced it and then investigate that specific machine.
General direction for investigation: A0 is a non-breaking space in many single-byte encodings, and some text-processing applications turn it into a regular space (20). Most likely, some program in the data transfer chain interprets the text as ANSI and therefore treats A0 as a standalone character rather than as part of a UTF-8 sequence. So you need to identify which programs are involved in the transfer and track the content of the text at their inputs and outputs.
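The suspected mechanism can be reproduced in a few lines. This is only a sketch of the hypothesis, not of what any particular program does: the helper `lossy_single_byte_roundtrip` is hypothetical, and Latin-1 is assumed as the single-byte encoding because it maps every byte to a character and has the non-breaking space at A0.

```python
NBSP = "\u00a0"  # non-breaking space, byte A0 in Latin-1


def lossy_single_byte_roundtrip(data: bytes, codec: str = "latin-1") -> bytes:
    """Simulate a tool that misreads UTF-8 as a single-byte encoding
    and "normalizes" non-breaking spaces to regular spaces."""
    text = data.decode(codec)          # each byte becomes one character
    text = text.replace(NBSP, " ")     # A0 -> 20
    return text.encode(codec)          # write the bytes back unchanged otherwise


original = "Р".encode("utf-8")         # Cyrillic capital Er, U+0420
damaged = lossy_single_byte_roundtrip(original)
print(original.hex(), "->", damaged.hex())
```

Letters whose UTF-8 encoding does not contain the byte A0 pass through such a round trip unchanged, which would explain why only “Р” is affected in ordinary Russian text.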
