Python
Timebird, 2018-03-27 12:27:58

How to solve encoding problems with re.split?

Hello!
I have a UTF-8 encoded .txt file containing Cyrillic text. It opens fine in Jupyter (macOS).
I want to split it on tabs, so I write f.split('/t') and get something like:

\xd0\xbc\xd0\xb0\xd0\xb3\xd0\xb0 \xd0\xbb\xd0\xb8\xd1\x81\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x8f

The text is split into words correctly, but the encoding is broken. How do I fix this?
Thanks in advance.

1 answer(s)
Alex F, 2018-03-27
@delvin-fil

So:
"The division into words is correct, but the encoding is broken. How to fix it?"

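with open('zzz.txt', encoding='utf-8') as f:  # Python 3: decode the file as UTF-8
    # Split each line into fields on the tab character
    mylist = [line.split('\t') for line in f]
print(mylist)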


[Finished in 0.2s]
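
For what it's worth, the \xd0\xbc... sequences in the question are probably not corruption. They look like the UTF-8 bytes of the Cyrillic text, shown in escaped form because a list of byte strings prints the repr of each element. A minimal sketch, assuming the file really is UTF-8 and tab-separated as described; read_tsv and the sample line are made up for illustration (the two words from the question with a tab placed between them), and io.open is used so the same code runs on Python 2 and 3:

```python
import io
import os
import tempfile

# Hypothetical sample line: the two byte sequences from the question,
# separated by a tab and ended with a newline.
raw = (b'\xd0\xbc\xd0\xb0\xd0\xb3\xd0\xb0\t'
       b'\xd0\xbb\xd0\xb8\xd1\x81\xd1\x82\xd0\xbe\xd0\xb2\xd0\xb0\xd1\x8f\n')

# Decoding the bytes as UTF-8 recovers the text behind the escapes.
decoded = raw.decode('utf-8')


def read_tsv(path):
    """Read a UTF-8 file and return a list of rows of decoded text fields."""
    with io.open(path, encoding='utf-8') as f:
        # Drop the trailing newline so it does not stick to the last field.
        return [line.rstrip('\n').split('\t') for line in f]


# Write the sample to a temporary file and read it back.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
with open(path, 'wb') as f:
    f.write(raw)
rows = read_tsv(path)
os.remove(path)
```

Printing the fields one by one, for example print(' | '.join(rows[0])), shows the words themselves. Printing the whole list shows the repr of each element, which in Python 2 is still escaped even when the data is fine.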
