Python
Viktor Kokorich, 2020-09-18 09:59:01

Why does my parser return None?

I wrote a parser, but it returns None for one field. I assume this is because of the large amount of JavaScript on the site.
I need to parse the photo. The price and the name parse fine, but when I try to get the src of a photo, I get None.
Is BeautifulSoup alone enough for this, or should I use Selenium? I would be very grateful for any help :)
P.S. This is my first parser.

The code:

import requests
from bs4 import BeautifulSoup

HOST = "https://irr.ru/cars/passenger/"
URL = "https://irr.ru/cars/passenger/lexus/"
HEADERS = {
    'user-agent': 'your user-agent here',
    'accept': 'your accept header here'
    }

def get_html(url, params=''):
    r = requests.get(url, headers=HEADERS, params=params)
    return r


def get_content(html):
    soup = BeautifulSoup(html, 'html.parser')
    items = soup.find_all('div', class_='listing__item')
    wgg = []

    for item in items:
        wgg.append(

            {
                'title':item.find('div', class_='listing__itemInner').find('div', class_='listing__itemColumn listing__itemColumn_main').find('div', class_='listing__itemTitleWrapper').find('a', class_='listing__itemTitle').find('div', class_='js-productListingProductName').get_text(),
                'link':item.find('div', class_='listing__itemInner').find('div', class_='listing__itemColumn listing__itemColumn_main').find('div', class_='listing__itemTitleWrapper').find('a').get('href'),
                'foto':item.find('div', class_='listing__itemInner').find('div', class_='listing__itemColumn').find('div', class_='listing__imageWrapper').find('img',).get('src') ,
                'cena':item.find('div', class_='listing__itemInner').find('div', class_='listing__itemColumn listing__itemColumn_price').find('div', class_='listing__itemPrice').get_text(),
              
            }

        )

    return wgg


html = get_html(URL)

print(get_content(html.text))


Here is the result:

[{'title': 'Lexus LX 570', 'link': ' https://irr.ru/cars/passenger/used/lexus-lx-570-vn... ', 'foto': None, 'cena': '\n \t 2\xa0345\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus GX 470', 'link': ' https://irr.ru/cars/passenger/used/lexus-gx-470-vn... ', 'foto': None, 'cena': '\n \t 1\xa0200\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus RX 270', 'link': ' https://irr.ru/cars/passenger/used/lexus-rx-270-vn... ', 'foto': None, 'cena': '\n \t 1\xa0320\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus GS 250', 'link': ' https://irr.ru/cars/passenger/used/lexus-gs-250-se...', 'foto': None, 'cena': '\n \t 1\xa0500\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus LX 570', 'link': ' https://irr.ru/cars/passenger/used/lexus-lx-570-vn... ', 'foto': None, 'cena': '\n \t 1\xa0839\xa0999\xa0rub.\n \t '},
 {'title': 'Lexus RX 350', 'link': ' https://irr.ru/cars/passenger/used/lexus-rx-350-vn... ', 'foto': None, 'cena': '\n \t 845\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus LS 460', 'link': ' https://irr.ru/cars/passenger/used/lexus-ls-460-se... ', 'foto': None, 'cena': '\n \t 1\xa0400\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus LX', 'link': ' https://irr.ru/cars/passenger/used/lexus- lx limuzi...', 'foto': None, 'cena': '\n \t 3\xa0900\xa0000\xa0RUB\n \t '},
 {'title': 'Lexus LX 570', 'link': ' https://irr.ru/cars/passenger/used/lexus-lx-570-vn... ', 'foto': None, 'cena': '\n \t 3\xa0400\xa0000\xa0rub.\n \t'}]


1 answer(s)
datka, 2020-09-18
@66656665

'foto':item.find('img', class_='listing__image')["data-src"]
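
For context, here is a minimal sketch of why reading data-src can work where src returns None. It assumes the page lazy-loads its images: the HTML sent by the server keeps the real URL in data-src, and a script copies it into src only in the browser, which requests never runs. The markup in SAMPLE_HTML, the example.com URLs, and the helper name image_url are made up for illustration; only the class names are taken from the code above, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating three listing cards (not copied from the real site):
# one lazy-loaded image, one ordinary image, and one card without a photo.
SAMPLE_HTML = """
<div class="listing__item">
  <img class="listing__image" data-src="https://example.com/photo1.jpg">
</div>
<div class="listing__item">
  <img class="listing__image" src="https://example.com/photo2.jpg">
</div>
<div class="listing__item">
  <div class="listing__itemPrice">1 200 000 RUB</div>
</div>
"""


def image_url(item):
    """Return the photo URL of one listing card, or None if it has no image."""
    img = item.find('img', class_='listing__image')
    if img is None:
        return None
    # Prefer data-src (lazy-loaded images), fall back to a plain src.
    # .get() returns None for a missing attribute instead of raising KeyError.
    return img.get('data-src') or img.get('src')


soup = BeautifulSoup(SAMPLE_HTML, 'html.parser')
photos = [image_url(item) for item in soup.find_all('div', class_='listing__item')]
print(photos)
```

Compared with indexing the tag as img["data-src"], the .get() calls keep the loop from crashing on a card that has no photo or no data-src attribute. If the attribute turns out to be missing from the raw HTML altogether, that would suggest the images are added entirely by JavaScript, and a browser-driven tool such as Selenium might be needed after all.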
