How do I get past a UnicodeDecodeError when reading a file in Python?
I have a 1.2 GB text file and a Python script that reads it line by line.
import sys

logFile = open(sys.argv[1], 'r')
count = 0
for log in logFile:
    print(count)  # number of the line being processed
    count += 1
...
  File "./parcer.py", line 75, in <module>
    for log in logFile:
  File "/usr/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 867: invalid continuation byte
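The message can be reproduced in isolation, which helps show what it means. In UTF-8, the byte 0xc8 announces a two-byte sequence, so the decoder fails when the next byte is not a continuation byte. The sketch below is only an illustration: the sample bytes and the helper name try_utf8 are made up, and cp1251 is an assumption about where a lone 0xc8 might come from, since the real encoding of the log file is not stated anywhere in the question.

```python
# Hypothetical sample data: in cp1251 the Cyrillic letter U+0418 is the single
# byte 0xC8. The real file's encoding is unknown; cp1251 is only a guess.
raw = "\u0418".encode("cp1251") + b" log"


def try_utf8(data: bytes) -> str:
    """Return the decoded text, or a short description of why UTF-8 decoding failed."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the offending byte, exc.reason the decoder's explanation.
        return f"byte 0x{data[exc.start]:02x} at position {exc.start}: {exc.reason}"


print(try_utf8(raw))  # byte 0xc8 at position 0: invalid continuation byte

# The same bytes decode cleanly once the (assumed) source encoding is named.
decoded = raw.decode("cp1251")
print(decoded == "\u0418 log")  # True
```

If the file really was written in a single-byte encoding such as cp1251, passing that encoding to open() would avoid the error altogether; if the encoding is mixed or unknown, an error handler is the more realistic route.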
Answer the question
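import sys

# Read raw bytes so that decoding happens explicitly, one line at a time.
with open(sys.argv[1], 'rb') as f:
    for n, line in enumerate(f):
        try:
            # 'ignore' silently drops any bytes that are not valid UTF-8.
            print(n, line.decode('utf8', 'ignore'))
        except Exception as e:
            # Report the failure and save the raw line for later inspection.
            print(n, 'failed to process line:', e)
            with open('holy_shit.csv', 'ab') as w:
                w.write(line)
            continue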
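A variation on the same idea, offered as a sketch rather than a drop-in fix: open() accepts an errors argument in text mode, so the file can still be iterated line by line without decoding by hand. With errors='replace' every undecodable byte becomes U+FFFD, which keeps damaged lines visible instead of quietly shortening them as 'ignore' does. The helper name read_lines and the demo file are hypothetical, and cp1251 is again only an assumed encoding.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8", errors="replace"):
    """Yield (line_number, text) pairs; undecodable bytes follow the `errors` policy."""
    with open(path, "r", encoding=encoding, errors=errors) as f:
        for n, line in enumerate(f):
            yield n, line.rstrip("\n")


# Demo on a small temporary file with one byte (0xC8) that is not valid UTF-8 here.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"ok line\nbad \xc8 line\n")
demo_path = tmp.name

# Policy 1: keep UTF-8 and mark the bad byte with U+FFFD.
replaced = list(read_lines(demo_path))
# Policy 2: decode with an assumed single-byte encoding instead.
as_cp1251 = list(read_lines(demo_path, encoding="cp1251", errors="strict"))
os.remove(demo_path)

for n, text in replaced:
    # ascii() escapes non-ASCII characters so the output prints on any console.
    print(n, ascii(text))
```

For the 1.2 GB log, the loop would become "for n, text in read_lines(sys.argv[1]):". Reading stays lazy, so memory use does not grow with the size of the file.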