Problem with encoding in requests_html?
I can't get the page title to parse in the correct encoding.
>>> from requests_html import HTMLSession
>>> session = HTMLSession()
>>> r = session.get('https://pm.by/live.html')
>>> print(r.encoding)
WINDOWS-1251
>>> r.html.xpath('//title/text()')
['������ Live � ������ �� ����� ���� (�� ���� �����): �� ��������']
>>> r.html.xpath('//title/text()')[0].encode('cp1251')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.8/encodings/cp1251.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-5: character maps to <undefined>
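The traceback suggests the title string already contains U+FFFD replacement characters, which have no slot in cp1251, so re-encoding it cannot bring the original letters back. One plausible cause, not confirmed here, is that the page bytes were decoded with the wrong codec before the XPath query ran. The offline sketch below reproduces that effect with a made-up sample title (`original` is an assumption, not the real page title) and shows that decoding the untouched bytes with the declared charset recovers the text.

```python
# Hypothetical stand-in for the raw response body: a title encoded as windows-1251.
original = "Ставки Live"
raw = original.encode("cp1251")

# Decoding cp1251 bytes as UTF-8 loses the Cyrillic letters: every invalid byte
# is replaced with U+FFFD, the character that prints as a black diamond.
garbled = raw.decode("utf-8", errors="replace")

# U+FFFD is not part of cp1251, so encoding the garbled string raises
# UnicodeEncodeError, just like the traceback above.
try:
    garbled.encode("cp1251")
    reencode_failed = False
except UnicodeEncodeError:
    reencode_failed = True

# Decoding the original bytes with the correct codec gives the text back.
recovered = raw.decode("cp1251")

print(repr(garbled))
print(reencode_failed)
print(recovered)
```

In the session above, the analogous step would be to start from the raw bytes, for example `r.content.decode(r.encoding)`, and extract the title from that string, assuming the charset the server reports is the one the page actually uses.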