How to decode unicode escaped strings?
When parsing with BeautifulSoup, I sometimes get strings containing escape sequences like \uXXXX
For example:
element.text
>>> 'Плотность пленки \u2013 10 мкн'
element.text.decode('unicode_escape')
AttributeError: 'str' object has no attribute 'decode'
First, what you see in the console is the string rendered for the console's encoding. That is not necessarily how Python holds it internally (most likely as unicode). Judging by the text, the string contains a character (U+2013, an en dash) that the console encoding cannot represent, so the console shows its escape code instead of the character itself.
Second, search for that character inside your program to check that Python is handling it correctly (for example, test whether '\u2013' in element.text is true).
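Third, the error says that the str type has no decode method: the value is already decoded text, and only bytes can be decoded, so wrapping it in str() will not help. If the check above shows that the string really does contain a literal backslash sequence, encode it to bytes first and then decode: element.text.encode('latin-1', 'backslashreplace').decode('unicode_escape')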
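The three points above can be sketched as follows. This is a minimal illustration, assuming Python 3; the variables real_dash and escaped are hypothetical stand-ins for element.text, and the helper names has_literal_escape and decode_unicode_escapes are made up for this example.

```python
import re

# Hypothetical stand-ins for element.text (not taken from a real page).
real_dash = "Плотность пленки \u2013 10 мкн"   # holds the actual EN DASH character
escaped = "Плотность пленки \\u2013 10 мкн"    # holds a backslash followed by u2013


def has_literal_escape(text):
    """Return True if the text contains a literal backslash-u escape sequence."""
    return re.search(r"\\u[0-9a-fA-F]{4}", text) is not None


def decode_unicode_escapes(text):
    """Turn literal backslash-u sequences into the characters they name.

    Encoding with 'backslashreplace' first rewrites every character outside
    Latin-1 (the Cyrillic letters here) as an escape sequence, so the
    'unicode_escape' decoder restores them instead of mangling them.
    Note: any other backslash sequences in the text are interpreted as well.
    """
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")


# Point two: check what the string really contains.
print("\u2013" in real_dash)            # the dash is really there
print(has_literal_escape(real_dash))    # no decoding needed in this case
print(has_literal_escape(escaped))      # here decoding is needed

# Point three: decode only when a literal escape sequence is present.
print(decode_unicode_escapes(escaped) == real_dash)
```

If the first check is true for your string, nothing needs decoding: the text is already correct and only the console display differs.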