Python
Mom's Programmer, 2020-07-01 13:54:16

My Python parser code raises an error. What is causing it?

from bs4 import BeautifulSoup
import requests

def parse():
  URL = 'https://www.olx.ua/list/'
  HEADERS = {
   'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
  }

  response = requests.get(URL, headers=HEADERS)
  soup = BeautifulSoup(response.content, 'html.parser')
  items = soup.findAll('div', class_ = 'offer-wrapper')
  comps = []

  for item in items:
    comps.append({
      'title': item.find('a', class_ = 'marginright5 link linkWithHash detailsLink').get_text(strip = True),
      'price': item.find('p', class_ = 'price').get_text(strip = True),
      'link': item.find('a', class_ = 'marginright5 link linkWithHash detailsLink').get('href')
    })

  for comp in comps:
    print(f'{comp["title"]} -> Price: {comp["price"]} -> Link: {comp["link"]}')

parse()


The page does contain a p tag with the price class, but this error is still raised:

Traceback (most recent call last):
  File "pars2.py", line 26, in <module>
    parse()
  File "pars2.py", line 19, in parse
    'price': item.find('p', class_ = 'price').get_text(strip = True),
AttributeError: 'NoneType' object has no attribute 'get_text'


1 answer(s)
Sergey Karbivnichy, 2020-07-01
@Ekaterina_kuzkova

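Most likely the promoted (top) ads come first, and they may use a different class (I have not looked into it). For those items find() returns None, so calling get_text() on the result fails.
You can just add try...except: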

  # Inside parse(), replacing the original loop
  for item in items:
    try:
      comps.append({
        'title': item.find('a', class_ = 'marginright5 link linkWithHash detailsLink').get_text(strip = True),
        'price': item.find('p', class_ = 'price').get_text(strip = True),
        'link': item.find('a', class_ = 'marginright5 link linkWithHash detailsLink').get('href')
      })
    except AttributeError:
      # find() returned None for a missing tag; skip this item
      continue

  for comp in comps:
    print(f'{comp["title"]} -> Price: {comp["price"]} -> Link: {comp["link"]}')

parse()
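
As an alternative to try...except, you could check the result of find() for None before calling get_text(). The following is only a sketch: the helper names text_of and extract_offer are hypothetical and not part of BeautifulSoup, and the class names are the ones from the question, which may not match every ad on the page.

```python
def text_of(item, name, css_class):
    """Return the stripped text of the first matching child tag, or None if absent."""
    # item is assumed to be a bs4 Tag; find() returns None when nothing matches
    tag = item.find(name, class_=css_class)
    return tag.get_text(strip=True) if tag is not None else None


def extract_offer(item):
    """Build one result dict from an offer element, or return None if it is incomplete."""
    link_tag = item.find('a', class_='marginright5 link linkWithHash detailsLink')
    price = text_of(item, 'p', 'price')
    if link_tag is None or price is None:
        # This item does not have the expected markup (for example, a different ad type)
        return None
    return {
        'title': link_tag.get_text(strip=True),
        'price': price,
        'link': link_tag.get('href'),
    }
```

Inside parse(), the loop could then become comps = [offer for offer in map(extract_offer, items) if offer is not None]. Unlike a broad except, this skips only the items that are missing a tag and does not hide unrelated errors.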
