Python
Viktor, 2016-05-25 23:51:42

How do I write Russian characters to a file in Python?

I have a small Python 2 program that reads a file of Russian words, collects them into a list of conversations, and then writes that collection to a JSON file:

f = open("in.txt")
conversations = open("data/russian/conversations.json", "wb")
ar = {"conversations": []}

worker = []
for line in f:
    if line == "-----":
        ar["conversations"].append(worker)
        worker = []
    else:
        worker.append(line.strip())
print ar["conversations"][0][0]
conversations.write(str(ar))
conversations.close()

As a result, the JSON file contains something like this:
{'conversations': [['\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82!', '\xd0\x97\xd0\xb4\xd1\x80\xd0\xb0\xd0\xb2\xd1\x81\xd1\x82\xd0\xb2\xd1\x83\xd0\xb9!', // and so on

The question is: how do I correctly write Russian characters to a file? I tried .encode("utf-8"), but it did not help.

abcd0x00, 2016-05-26
@abcd0x00

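# -*- coding: utf-8 -*-
import codecs
import json

# codecs.open encodes unicode text to UTF-8 as it is written
with codecs.open('file.txt', 'w', encoding='utf-8') as fout:
    # ensure_ascii=False writes the Cyrillic text itself instead of \uXXXX escapes
    json.dump({u'абв': u'где'}, fout, ensure_ascii=False)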

$ cat file.txt
{"абв": "где"}
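
Applied to the program in the question, the same idea might look like the sketch below. It assumes that in.txt is UTF-8 encoded and that conversations are separated by lines containing only "-----"; the function name build_conversations is hypothetical. It uses io.open, which accepts an encoding argument on both Python 2.6+ and Python 3. Two things differ from the original code: each line is stripped before it is compared to the separator (a line read from a file keeps its trailing newline, so line == "-----" never matches), and the data is serialized with json.dumps instead of str(), which produces a Python repr rather than JSON.

```python
# -*- coding: utf-8 -*-
import io
import json


def build_conversations(in_path, out_path, separator=u"-----"):
    """Group the lines of in_path into conversations and save them as JSON."""
    conversations = []
    current = []
    # io.open decodes the file, so every line is a unicode string
    with io.open(in_path, encoding="utf-8") as src:
        for line in src:
            line = line.strip()  # drop the trailing newline before comparing
            if line == separator:
                conversations.append(current)
                current = []
            else:
                current.append(line)
    # Assumption: a final group without a closing separator is kept too
    if current:
        conversations.append(current)

    data = {u"conversations": conversations}
    # ensure_ascii=False keeps Cyrillic readable instead of \uXXXX escapes
    text = json.dumps(data, ensure_ascii=False)
    if isinstance(text, bytes):  # Python 2 may return str for ASCII-only data
        text = text.decode("utf-8")
    with io.open(out_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return data
```

Calling build_conversations("in.txt", "data/russian/conversations.json") would then write the conversations as readable UTF-8 JSON.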

Vladimir Kuts, 2016-05-26
@fox_12

Try something like this:

# -*- coding: utf-8 -*-
import codecs

# codecs.open encodes unicode strings to UTF-8 on write
with codecs.open("somefile", "w", "utf-8") as out:
    out.write(u'какая-то строка')
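
As for why the file in the question is full of \xd0\x9f... sequences: str() on a dict produces a Python repr, and in Python 2 the repr of a byte string escapes every non-ASCII byte. The bytes themselves are valid UTF-8, which the sketch below illustrates. It is written so that it also runs on Python 3, so a bytes literal stands in for the Python 2 str read from the file; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
import json

# The first item from the question's output, as raw UTF-8 bytes
raw = b"\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82!"

# Decoding the bytes recovers the original text
word = raw.decode("utf-8")

# By default json.dumps escapes non-ASCII characters as \uXXXX
escaped = json.dumps([word])

# With ensure_ascii=False the characters are kept as they are
readable = json.dumps([word], ensure_ascii=False)
```

Both results are valid JSON and load back to the same string; ensure_ascii=False only changes how readable the file is to a person.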
