Python
Ilya, 2015-04-06 16:04:53

How to use Unicode in Python 2.7?

Hello. I have always used Python 3 (sane Unicode handling is just one of the reasons).
Now I want to use the Scrapy library, but it only supports Python 2.
I have more or less figured out the library itself, but since I need to parse a site in Cyrillic, I get this:

>>> s
u'\u043d\u0430'
>>> s.encode('utf-8')
'\xd0\xbd\xd0\xb0'

Looking closer, the site declares:
<meta http-equiv="Content-Type" content="text/html; charset=windows-1251" />

But:
>>> s.encode('windows-1251')
'\xed\xe0'

So that did not help either.
Where should I dig, and how do I deal with this?


1 answer(s)
Bulat Kurbangaliev, 2015-04-11
@ilov3

u'\u043d\u0430' is how Python displays a Unicode string (any literal that starts with u'' is one); the \uXXXX escapes are the code points of the characters, not corrupted data.
The encode method converts a Unicode string into bytes in the encoding you need, for example to write it to a file.
This article helped me a lot at the time: habrahabr.ru/post/135913
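
To make the encode/decode round trip concrete, here is a minimal sketch using the string from the question. It is only an illustration: the helper names save_text and load_text are hypothetical, and the file path is a throwaway temporary file. The code is written so that it should run on both Python 2.7 and Python 3.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

# The value from the question: two Cyrillic letters as a Unicode string.
s = u'\u043d\u0430'

# encode() turns text into bytes; the bytes differ per encoding.
utf8_bytes = s.encode('utf-8')           # '\xd0\xbd\xd0\xb0'
cp1251_bytes = s.encode('windows-1251')  # '\xed\xe0'

# decode() is the reverse step, e.g. for raw bytes taken from a page
# that declares charset=windows-1251.
assert utf8_bytes.decode('utf-8') == s
assert cp1251_bytes.decode('windows-1251') == s


def save_text(path, text, encoding='utf-8'):
    """Hypothetical helper: write Unicode text to a file as encoded bytes."""
    # io.open encodes on write, so no manual encode() call is needed.
    with io.open(path, 'w', encoding=encoding) as f:
        f.write(text)


def load_text(path, encoding='utf-8'):
    """Hypothetical helper: read a file back as Unicode text."""
    with io.open(path, 'r', encoding=encoding) as f:
        return f.read()


# Round trip through a temporary file.
path = os.path.join(tempfile.mkdtemp(), 'out.txt')
save_text(path, s)
assert load_text(path) == s
```

Note that the interactive prompt echoes repr(s), which is why the escapes appear; the string itself holds two characters, and print s should show them as letters if the terminal encoding supports Cyrillic.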
