PHP
Alexander Pankov, 2018-07-26 19:04:58

How to detect and fix a file's encoding in PHP?

Hello. I have a file that I download with curl_exec.
When I open it in Sublime, I see this:
; Ðîññèÿ; ì2 ;;; 1; 52; Êåðàìîãðàíèò; 000002886 ;;; Äëÿ ïîëà, Îáùåñòâåííûå ïîìåùåíèÿ, Ñòðîèòåëüíàÿ ïëèòêà; Áàçîâàÿ ïëèòà; øòóêà; 60; 60; 0.95 ;;; 4, 4, 1.44, 1.44, 31.8, 31,795 ; 60.5 bow60.5, 60.5, 60.5; 7.11; 11; 11.11;
This is one line from the file (a CSV export). When I read the file in PHP, the debugger shows diamonds instead of Russian letters.
How can I fix the contents of the file so that the text is in Russian?
Interestingly, the line above is what Sublime shows by default, but if I choose Reopen with Encoding and pick Windows-1251, the file looks perfect:
01 Tile;x9999069423;;Genesis dark gray K-108/SR (2q108/SR) 600x600x10 matte;1310;1310;Kerranova;Russia; m2;;;1;52;Porcelain stoneware;000002886;;;For flooring, Public premises, Construction tiles;Base plate;piece;60;60;0.95;;;4, 4;1.44, 1.44;31.8, 31.795;60.5x5x60 .5, 60.5x5x60.5;7.11;Marble and granite;Gray;Unglazed;Matte;Yes;No;;;;;;;Ceramic;Floor;;;;;;;;;;;;;;
The most interesting part: if I then choose Save with Encoding and pick Windows-1251, Sublime refuses and says that not all characters can be converted.
Please help: how do I determine the current encoding, and what do I need to do in PHP so that the contents of the file are read correctly (without diamonds and question marks)?


1 answer(s)
Moskus, 2018-07-26
@PankovAlxndr

There is nothing "interesting" here. Your file is a CSV in Windows-1251, and in the first example it is displayed as Windows-1252 or ISO-8859-1 (evidently what your Sublime uses by default for single-byte encodings), so it is simply misinterpreted. It is a single-byte encoding rather than multi-byte Unicode, because the garbled and the correct text have the same number of characters and no identical character repeats at every other position (which is what a high byte would look like).
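To make the misinterpretation concrete, here is a minimal sketch. It assumes the mbstring extension is enabled; the variable names are illustrative, and the byte string is simply the word "Россия" from the question written out as Windows-1251 bytes.

```php
<?php
// Raw bytes of "Россия" in Windows-1251: one byte per letter.
$cp1251 = "\xD0\xEE\xF1\xF1\xE8\xFF";

// These bytes do not form valid UTF-8, which is why a UTF-8 viewer
// (for example, a debugger) shows replacement diamonds.
$isUtf8 = mb_check_encoding($cp1251, 'UTF-8'); // false

// Reading the same bytes as a Western single-byte encoding gives the
// garbled text seen in the editor.
$misread = mb_convert_encoding($cp1251, 'UTF-8', 'ISO-8859-1'); // "Ðîññèÿ"

// Reading them as Windows-1251 gives the intended text.
$correct = mb_convert_encoding($cp1251, 'UTF-8', 'Windows-1251'); // "Россия"

// Both readings have the same number of characters as there are bytes,
// which is the hallmark of a single-byte encoding.
$byteCount    = strlen($cp1251);               // 6
$misreadChars = mb_strlen($misread, 'UTF-8');  // 6
$correctChars = mb_strlen($correct, 'UTF-8');  // 6
```

The bytes themselves never change between the two readings; only the table used to interpret them does.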
You don't need to "fix" anything. Convert the file from single-byte Windows-1251 to UTF-8, and the ambiguity in how the encoding is interpreted disappears.
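A rough sketch of that conversion in PHP, assuming the mbstring extension is available. The helper name cp1251ToUtf8 is hypothetical, and the "already valid UTF-8" check is only a heuristic guard added for illustration, not part of the answer. The sample bytes stand in for the string that curl_exec returns when CURLOPT_RETURNTRANSFER is set.

```php
<?php
/**
 * Convert a Windows-1251 byte string to UTF-8.
 * Hypothetical helper: input that is already valid UTF-8 (for example
 * plain ASCII) is returned unchanged, as a heuristic guard against
 * converting the same data twice.
 */
function cp1251ToUtf8(string $bytes): string
{
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    return mb_convert_encoding($bytes, 'UTF-8', 'Windows-1251');
}

// Stand-in for the downloaded data: "Россия;м2;Керамогранит" in Windows-1251.
$raw = "\xD0\xEE\xF1\xF1\xE8\xFF;\xEC2;"
     . "\xCA\xE5\xF0\xE0\xEC\xEE\xE3\xF0\xE0\xED\xE8\xF2";

$utf8 = cp1251ToUtf8($raw);

// Split the converted line on the semicolon delimiter used in the file.
$fields = str_getcsv($utf8, ';', '"', '\\');
// $fields is ['Россия', 'м2', 'Керамогранит']
```

If mbstring is not installed, iconv('Windows-1251', 'UTF-8', $raw) should do the same job. Converting once, right after the download, means everything downstream can assume UTF-8.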
