Site parsing returns Cyrillic text as '\u0410'-style escapes. How do I convert it to Cyrillic?

bugagashnik, 2018-03-13 13:59:55

Python

I extracted an element's text, and the console prints it in the form '\u0410\u0443\u043c\u0435\u043d\u044f \u0449\u0430\u0441 \u0432\u043e\u0442 \u0442\u0430\u043a\u043e\u0435 \u0447\u0447 \u043e \u0441\u0442\u043e\u0438\u0442 \u043f\u043e\u0434 \u043e\u043a\u043d\u0430\u043c\u0438!'. How do I convert this to Cyrillic? And what is this format called?

1 answer(s)

Andy_U, 2018-03-13
@bugagashnik

Here is the code in Python 3.6:

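# Raw string: the backslashes stay literal, as in the scraped text
s = r'\u0410 \u0443 \u043c\u0435\u043d\u044f \u0449\u0430\u0441 \u0432\u043e\u0442 \u0442\u0430\u043a\u043e\u0435 \u0447\u043c\u043e \u0441\u0442\u043e\u0438\u0442 \u043f\u043e\u0434 \u043e\u043a\u043d\u0430\u043c\u0438!'
# unicode_escape turns each \uXXXX sequence into the character it names
print(s.encode('ascii').decode('unicode_escape'))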

Outputs to the console:
А у меня щас вот такое чмо стоит под окнами!

But the real mistake happened earlier: somewhere you decoded the response bytes as 'ascii', which was wrong. You should have decoded them with the encoding declared in the page header.
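As for the format itself: \uXXXX is a Unicode escape sequence, the same notation used in JSON and in Python string literals. Below is a rough sketch of the two steps together: decode the raw bytes with the charset named in the Content-Type header, then replace any literal escapes that remain. The function names, the sample header, and the sample bytes are all made up for illustration, so adapt them to however you actually fetch the page.

```python
import re
from email.message import Message


def charset_from_header(content_type: str, default: str = "utf-8") -> str:
    """Return the charset named in a Content-Type header value, or a default."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or default


def unescape_unicode(text: str) -> str:
    r"""Replace literal \uXXXX sequences with the characters they name.

    Unlike encode('ascii').decode('unicode_escape'), this also works when the
    text already contains real Cyrillic next to the escapes. Surrogate pairs
    (characters outside the BMP) are not handled in this sketch.
    """
    return re.sub(
        r"\\u([0-9a-fA-F]{4})",
        lambda m: chr(int(m.group(1), 16)),
        text,
    )


def decode_page(raw: bytes, content_type: str) -> str:
    """Decode response bytes with the declared charset, then unescape leftovers."""
    return unescape_unicode(raw.decode(charset_from_header(content_type)))


# Sample input invented for illustration: a windows-1251 body that mixes
# real Cyrillic with literal \uXXXX escapes.
raw = "Текст: \\u0410 \\u0443 \\u043c\\u0435\\u043d\\u044f".encode("cp1251")
print(decode_page(raw, "text/html; charset=windows-1251"))
```

If the text actually comes out of a JSON response, parsing it with json.loads should decode the escapes for you, and no manual step is needed.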