How do I fix the encoding in a Telegram bot?
I'm writing a Telegram bot. The data arrives as JSON encoded in cp1251, and I convert it to UTF-8. When I send the text, "Aukštųjų" becomes "Auk", and with urlencode it becomes "Aukštųjų". If I print the received data in the browser, copy it, and send it manually, either from a local server or via Heroku, everything works. But when the data comes from the external source, the message is cut off at the first non-ASCII character. Why does this happen, and how do I fix it?
$json = iconv('cp1251', 'utf-8', $json);
$json = json_decode($json, true);
$text = $json[0]['text']; // $text = Aukštųjų
$result = $url . '&text=Aukštųjų'; // Result: Aukštųjų
$result = $url . '&text=' . $text; // Result: Auk
$result = $url . '&text=' . urlencode($text); // Result: Aukštųjų
You have one mistake stacked on another.
1) JSON must be in UTF-8. You write that the data arrives (from where?) not in Unicode but in 8-bit Windows-1251. That should not happen.
2) You are trusting the header that says Windows-1251, but it is wrong: that encoding is Cyrillic and cannot represent the characters š and ų. See for yourself: https://en.wikipedia.org/wiki/Windows-1251
3) Because of (2), the call iconv('cp1251', 'utf-8', $json) makes no sense. If your JSON really is transmitted in an 8-bit encoding (which is itself an error, see point 1), then it is more likely something like ISO 8859-10 or ISO 8859-4.
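To illustrate points 2 and 3, here is a minimal sketch. It assumes the incoming bytes are in fact already UTF-8, which the question does not confirm. It shows what iconv('cp1251', 'utf-8', ...) does to UTF-8 input, and what a conversion from ISO 8859-4 would look like if the data really were 8-bit. The sample string and the variable names are illustrative only.

```php
<?php
// Assumption: this source file itself is saved as UTF-8.
$utf8 = 'Aukštųjų';

// Wrong: treat UTF-8 bytes as cp1251. Each byte of "š" (C5 A1) is read
// as a separate Cyrillic character, so the text turns into mojibake.
$mojibake = iconv('CP1251', 'UTF-8', $utf8);

// If the data really were 8-bit ISO 8859-4, this is the conversion to use.
// The 8-bit input is simulated here by converting the sample string first.
$latin4 = iconv('UTF-8', 'ISO-8859-4', $utf8);
$fixed  = iconv('ISO-8859-4', 'UTF-8', $latin4);

var_dump($mojibake === $utf8); // bool(false)
var_dump($fixed === $utf8);    // bool(true)
```

The point is only that the source encoding passed to iconv has to match the real bytes; which encoding that is still has to be established from the data.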
I can't tell you exactly how to fix this, because the question doesn't contain enough data and too many bugs are stacked on top of each other. You need to start from the beginning: find out what encoding the JSON is actually transmitted in, not by looking at the HTTP header but by looking at the data itself.
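One possible way to look at the data itself, sketched under assumptions: $raw stands for the response body exactly as received, before any iconv call, and inspectBytes is a hypothetical helper name, not part of any library. It reports whether the bytes are valid UTF-8 and dumps the non-ASCII bytes in hex so they can be compared against the code tables.

```php
<?php
/**
 * Hypothetical helper: report whether $raw is valid UTF-8 and list
 * the hex values of its non-ASCII bytes.
 */
function inspectBytes(string $raw): array
{
    // Collect every byte >= 0x80 (byte-wise match, no /u flag).
    preg_match_all('/[\x80-\xFF]/', $raw, $m);
    return [
        'valid_utf8' => mb_check_encoding($raw, 'UTF-8'),
        'high_bytes' => array_map('bin2hex', $m[0]),
    ];
}

// Sample input standing in for the real response body (an assumption).
$raw  = '[{"text":"Aukštųjų"}]';
$info = inspectBytes($raw);

if ($info['valid_utf8']) {
    // Already UTF-8: decode directly, no iconv needed.
    $json = json_decode($raw, true);
    $text = $json[0]['text'];
    // http_build_query percent-encodes the UTF-8 bytes for the URL.
    echo http_build_query(['text' => $text]), "\n"; // text=Auk%C5%A1t%C5%B3j%C5%B3
} else {
    // Not UTF-8: compare these hex values with the 8-bit code tables.
    echo implode(' ', $info['high_bytes']), "\n";
}
```

If the check reports valid UTF-8, the iconv step can simply be dropped. If it does not, the hex values show which 8-bit table the bytes belong to.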