Why does Python raise an encoding error when saving to CSV?
I'm trying to parse HTML pages, load all the data into a DataFrame and save it to CSV. Every step worked fine, but as soon as I get to saving to CSV, the script fails with an error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 49: character maps to <undefined>
I recently learned that this error does not always come from the codec itself. For example, I once got it when an unrelated file ended up in the folder with the HTML files.
I have tried every option I could think of and still cannot find what is wrong.
Program code:
# -*- coding: utf8 -*-
from bs4 import BeautifulSoup
import os
import pandas as pd
df = pd.DataFrame({
    'Column':['test']
})
path = 'C:\\Users\\Desktop\\folder'
os.chdir(path)
def main(x):
    html = open(x)
    soup = BeautifulSoup(html, 'html.parser')
    div = soup.find_all('a', class_='title-link')
    for i in div:
        b = i.get_text()
        df.loc[len(df)]=[b]
    print(df)
    return df
for filename in os.listdir(path):
    main(filename)
df.to_csv('C:\\Users\\Desktop\\out.csv', sep='\t', encoding='utf-8')
Answer:
What helped was changing:
    html = open(x)
to:
    with open(x, 'rb') as f:
        html = f.read()
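Here is a minimal sketch of why reading in binary mode likely helps, using only the standard library. It assumes the HTML files are UTF-8 and that the system's default code page is cp1251, which the 0x98 byte in the traceback suggests but the question does not confirm. The file name page.html and the helper read_html_bytes are hypothetical, for illustration only. A UnicodeDecodeError comes from reading text, so it would be raised by open(x) in text mode rather than by to_csv.

```python
import os
import tempfile

# Sample markup; 'И' (U+0418) is encoded in UTF-8 as the bytes 0xD0 0x98.
text = "<a class='title-link'>История</a>"


def read_html_bytes(path):
    """Hypothetical helper: read a file as raw bytes, with no implicit decoding."""
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "page.html")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # Text mode with the assumed Windows code page: byte 0x98 has no
    # mapping in cp1251, so decoding fails while reading.
    try:
        with open(path, encoding="cp1251") as f:
            f.read()
        decode_failed = False
    except UnicodeDecodeError:
        decode_failed = True

    # Binary mode never decodes, so reading cannot raise this error.
    raw = read_html_bytes(path)
    decoded = raw.decode("utf-8")
```

When BeautifulSoup is given bytes, it tries to detect the encoding itself. If the encoding is known in advance, opening the file in text mode with an explicit encoding, for example open(x, encoding='utf-8'), is another option.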