How do I correctly save the result after parsing with BeautifulSoup?

Python

Aibot92, 2021-03-02 13:48:36

How do I save the parsing result correctly? Writing the file fails with a KeyError:

error:

Traceback (most recent call last):
  File "/Users/alexs/Desktop/py/Parsing/parsing_mvi.py", line 62, in <module>
    pars()
  File "/Users/alexs/Desktop/py/Parsing/parsing_mvi.py", line 58, in pars
    seve_file(phone, FILE)
  File "/Users/alexs/Desktop/py/Parsing/parsing_mvi.py", line 17, in seve_file
    writer.writerow([items['title'], items['prise']])
KeyError: 'prise'

script code:

from bs4 import BeautifulSoup
from selenium import webdriver
import time
import csv


URL = '>>>'
HEDARS = {}
FILE = 'mvd.csv'

def seve_file(item,path):
    with open(path, 'w', newline='') as file:
        writer = csv.writer(file, delimiter = ';')
        writer.writerow(['модель', 'цена'])
        for items in item:
            writer.writerow([items['title'], items['prise']])

def get_himl(url):
    driver = ('/Users/alexs/Desktop/py/Parsing/geckodriver')
    option = webdriver.FirefoxOptions()
    option.set_preference('dom.webdriver.enabled', False)
    browser = webdriver.Firefox(executable_path=driver, options=option)
    browser.get(url)
    time.sleep(2)
    browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(1)
    r = browser.page_source
    browser.quit()
    return r



def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('div', class_='product-cards-layout product-cards-layout--grid')
    name = soup.find_all('a', class_='product-title__text product-title--clamp')
    prise = soup.find_all('span', class_='price__main-value ng-star-inserted')
    phone = []
    for name in name :
        phone.append({
            'title': name.get_text()
        })
    for prise in prise:
        phone.append({
            'prise' : prise.get_text().replace('\xa0', ' ')
        })
    return (phone)

def pars():
    phone = []

    for page in range(1,4):
        print(f'Анализ {page}  ...')
        a = URL + '&page=' + str(page)
        html = get_himl(a)
        phone.extend(get_content(html))
        seve_file(phone, FILE)
    print(f'найдено ' + str(len(phone)) + ' телефонов')


pars()

1 answer(s)

Evgeniy _, 2021-03-02
@Aibot92

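First, respect other users and post the full error listing.
Second, the KeyError occurs because you assemble the phone list incorrectly: every title and every price is appended as its own dictionary, so the first element that seve_file reads has a 'title' key but no price key.
With the xpath written correctly, what you actually end up with is this list: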

[{'title': ' Смартфон Samsung Galaxy S21 128GB Phantom Violet (SM-G991B) '},
 {'title': ' Смартфон Apple iPhone 12 128GB Black (MGJA3RU/A) '},
 {'title': ' Смартфон Xiaomi Mi 10T 8+128GB Black '},
 {'title': ' Смартфон Huawei Mate 40 Pro Mystic Silver (NOH-NX9) '},
 {'title': ' Смартфон Nokia 3.4 3+64GB Blue (TA-1283) '},
 {'title': ' Смартфон Xiaomi Redmi 9 3+32GB Carbon Grey '},
 {'title': ' Смартфон Apple iPhone 11 128GB Black (MHDH3RU/A) '},
 {'title': ' Смартфон Apple iPhone 11 64GB Black (MHDA3RU/A) '},
 {'title': ' Смартфон Apple iPhone XR 64GB Black (MH6M3RU/A) '},
 {'price': '67 990 руб.'},
 {'price': '84 990 руб.'},
 {'price': '40 990 руб.'},
 {'price': '89 990 руб.'},
 {'price': '11 490 руб.'},
 {'price': '9 990 руб.'},
 {'price': '59 990 руб.'},
 {'price': '54 990 ruб.'},
 {'price': '44 990 руб.'}]
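A quick way to confirm that a list of this shape is what breaks the CSV writer is to look for the first element that lacks one of the expected keys. The snippet below is only a sketch: the shortened phone list is stand-in data, first_missing_key is a hypothetical helper that is not part of the original script, and the key is spelled 'prise' to match the question's code.

```python
# Stand-in data with the same shape as the list above:
# titles and prices sit in separate dictionaries.
phone = [
    {'title': 'Phone A'},
    {'title': 'Phone B'},
    {'prise': '100 руб.'},
    {'prise': '200 руб.'},
]


def first_missing_key(rows, keys=('title', 'prise')):
    """Return (index, key) for the first row lacking one of the keys, else None."""
    for index, row in enumerate(rows):
        for key in keys:
            if key not in row:
                return index, key
    return None


# The very first element has a title but no price, which is exactly
# where items['prise'] raises KeyError in seve_file.
print(first_missing_key(phone))  # (0, 'prise')
```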

It would be better to build a dictionary instead: {phone: price}.
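One possible way to get that {phone: price} shape is to pair titles and prices by position with zip and then write the pairs out. This is a minimal sketch, not a drop-in fix: pair_items and save_file are hypothetical names, the two input lists stand in for the get_text() results of the two find_all calls, and it assumes both lists come back in the same order with one price per title.

```python
import csv
import os
import tempfile


def pair_items(titles, prices):
    """Build {title: price} by position; zip stops at the shorter list."""
    return {
        title.strip(): price.replace('\xa0', ' ').strip()
        for title, price in zip(titles, prices)
    }


def save_file(phones, path):
    """Write one 'title;price' row per phone, after a header row."""
    with open(path, 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file, delimiter=';')
        writer.writerow(['модель', 'цена'])  # header: model, price
        for title, price in phones.items():
            writer.writerow([title, price])


# Stand-in strings; in get_content they would come from tag.get_text().
titles = [
    ' Смартфон Samsung Galaxy S21 128GB Phantom Violet (SM-G991B) ',
    ' Смартфон Apple iPhone 12 128GB Black (MGJA3RU/A) ',
]
prices = ['67\xa0990 руб.', '84\xa0990 руб.']

phones = pair_items(titles, prices)

# Hypothetical output location, chosen only so the example is safe to run.
path = os.path.join(tempfile.gettempdir(), 'mvd_example.csv')
save_file(phones, path)
```

Keep in mind that a dictionary keeps only one price per title, so two cards with identical names would collapse into one row. If a card on the page can lack a price, positional pairing will shift the remaining prices, and it is safer to read the title and price from inside each product card instead.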