Python
RockyMotion, 2019-05-27 09:03:32

Why does Python raise an encoding error when saving to CSV?

I'm trying to parse HTML pages, load the data into a DataFrame, and save it to CSV. Every step worked fine, but as soon as I try to save to CSV, I get this error:
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 49: character maps to <undefined>
I recently learned that this error does not always come from the codec itself. For example, I once got it when a third-party file ended up in the folder with the HTML files.
I have already tried every option I could think of and cannot find what is wrong.
Program code:

# -*- coding: utf8 -*-
from bs4 import BeautifulSoup
import os
import pandas as pd

df = pd.DataFrame({
    'Column':['test']
})

path = 'C:\\Users\\Desktop\\folder'
os.chdir(path)

def main(x):
    html = open(x)
    soup = BeautifulSoup(html, 'html.parser')
    div = soup.find_all('a', class_='title-link')
    for i in div:
        b = i.get_text()
        df.loc[len(df)]=[b]
        print(df)
    return df

for filename in os.listdir(path):
    main(filename)

df.to_csv('C:\\Users\\Desktop\\out.csv', sep='\t', encoding='utf-8')

Full error:
Traceback (most recent call last):
  File "C:/Users/PycharmProjects/dataset/parser.py", line 23, in <module>
    main(filename)
  File "C:/Users/PycharmProjects/dataset/parser.py", line 14, in main
    soup = BeautifulSoup(html, 'html.parser')
  File "C:\Users\PycharmProjects\dataset\venv\lib\site-packages\bs4\__init__.py", line 244, in __init__
    markup = markup.read()
  File "C:\Users\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1251.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors, decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x98 in position 49: character maps to <undefined>
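
Note that the traceback ends in markup.read() and cp1251.py, so the failure happens while the HTML file is being read, before to_csv is ever reached. open(x) without an encoding argument uses the system default codec, which here appears to be cp1251, and byte 0x98 has no character assigned in that code page. The sketch below reproduces the same kind of failure with the standard library only. It assumes the HTML files are UTF-8, which the traceback does not confirm; the sample character is just an illustration.

```python
# Assumption: the files are UTF-8. U+0418 (Cyrillic capital I) is one
# example of a character whose UTF-8 encoding contains the byte 0x98.
data = '\u0418'.encode('utf-8')  # the two bytes D0 98

try:
    # Decoding with cp1251 fails, because 0x98 is undefined in that code page.
    data.decode('cp1251')
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print(exc)

# Decoding with the encoding the bytes were written in works.
text = data.decode('utf-8')
```

If the files really are UTF-8, passing encoding='utf-8' to open() would be another way to avoid the default codec.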

1 answer(s)
RockyMotion, 2019-05-27
@RockyMotion

It helped to change:

html = open(x)

to:

with open(x, 'rb') as f:
    html = f.read()
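
Reading in binary mode works because no decoding happens in open(); BeautifulSoup accepts bytes and tries to detect the encoding itself. Since a stray non-HTML file in the folder can trigger the same error, it may also help to skip anything that is not an HTML file. Here is a minimal sketch of both ideas using only the standard library; the function name read_html_bytes and the choice of the .html/.htm extensions are assumptions for illustration, not part of the original code.

```python
from pathlib import Path


def read_html_bytes(folder):
    """Yield (file name, raw bytes) for each .html/.htm file in folder.

    Hypothetical helper: the name and the extension filter are assumptions.
    """
    for path in sorted(Path(folder).iterdir()):
        # Skip subfolders and unrelated files that ended up in the folder.
        if path.is_file() and path.suffix.lower() in {'.html', '.htm'}:
            # Binary mode: no codec is applied, so no UnicodeDecodeError here.
            with open(path, 'rb') as f:
                yield path.name, f.read()
```

Each raw value can then be passed to BeautifulSoup(raw, 'html.parser') in place of the file object used in the question.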
