Python
Omniverse, 2016-03-25 01:04:28

How do I read a UTF-8 file in Python?

Hello.
I created a file, text.txt, containing UTF-8 encoded text.
In Python I wrote:

f = open("text.txt", "r", encoding="utf-8")
print(f.read())
f.close()

Error:
UnicodeEncodeError: 'charmap' codec can't encode character u'\ufeff' in position 0: character maps to <undefined>
How do I read UTF-8 files correctly?


2 answers
Vladimir Kuts, 2016-03-25
@Omniverse

Why strip the BOM character by hand? That is extra manual work. Read the file with the BOM in place:

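import codecs
import io
import os

import chardet  # third-party package: pip install chardet

filename = 'test_file.txt'

# Read only the first few bytes: enough to spot a BOM or guess the encoding.
sample_size = min(32, os.path.getsize(filename))
with open(filename, 'rb') as raw_file:
    raw = raw_file.read(sample_size)

if raw.startswith(codecs.BOM_UTF8):
    # 'utf-8-sig' decodes UTF-8 and drops the leading BOM.
    encoding = 'utf-8-sig'
else:
    # chardet may report None when it cannot guess; fall back to UTF-8.
    encoding = chardet.detect(raw)['encoding'] or 'utf-8'

with io.open(filename, 'r', encoding=encoding) as infile:
    data = infile.read()

print(data)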
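
If the file is already known to be UTF-8, the detection step can probably be skipped: the 'utf-8-sig' codec drops a leading BOM when one is present and otherwise decodes like plain 'utf-8'. A minimal sketch under that assumption; read_text_utf8 is a hypothetical helper name, and the demo writes its own temporary file rather than using the asker's text.txt:

```python
import codecs
import io
import os
import tempfile


def read_text_utf8(path):
    """Hypothetical helper: read a UTF-8 file, dropping a leading BOM if present."""
    # 'utf-8-sig' skips the BOM when it is there and acts like 'utf-8' otherwise.
    with io.open(path, 'r', encoding='utf-8-sig') as infile:
        return infile.read()


if __name__ == '__main__':
    # Demo: create a temporary file that starts with a UTF-8 BOM.
    with tempfile.NamedTemporaryFile(delete=False, suffix='.txt') as tmp:
        tmp.write(codecs.BOM_UTF8 + u'hello'.encode('utf-8'))
    try:
        text = read_text_utf8(tmp.name)
        print(repr(text))  # the BOM character is not part of the result
    finally:
        os.remove(tmp.name)
```

With the BOM gone from the data, print no longer has to encode U+FEFF. Other characters that the console code page lacks could still raise the same UnicodeEncodeError.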

Nikon_NLG, 2016-03-25
@Nikon_NLG

# -*- coding: utf-8 -*-
Is this declared at the beginning of your code?
