Python
ubirust, 2021-12-25 13:57:36

Error "IndexError: list index out of range" in Python when parsing Avito, how can I fix it?

Hello everyone! I don't understand how to fix this "IndexError: list index out of range" error. The code used to work, and the page was parsed successfully.

The code itself:

from bs4 import BeautifulSoup  # for parsing pages
import requests                # for making requests to the site and fetching page content
from requests import get
import time
import random
import urllib.request

url = 'https://www.avito.ru/moskva/kvartiry/sdam/na_dlitelnyy_srok-ASgBAgICAkSSA8gQ8AeQUg?cd=1&s=104&user=1'
houses = []
count = 1
while count <= 1:
    url = 'https://www.avito.ru/moskva/kvartiry/sdam/na_dlitelnyy_srok-ASgBAgICAkSSA8gQ8AeQUg?cd=1&s=104&user=1'
    print(url)
    response = get(url)
    html_soup = BeautifulSoup(response.text, 'html.parser')

    house_data = html_soup.find_all('div', class_="iva-item-content-m2FiN")
    if house_data != []:
        houses.extend(house_data)
        value = random.random()
        scaled_value = 1 + (value * (9 - 5))
        print(scaled_value)
        time.sleep(scaled_value)
    else:
        print('empty')
        break
    count += 1

print(len(houses))
print(houses[0])
print()
n = int(len(houses)) - 1
count = 0
while count <= 5:  # count <= n
    info = houses[int(count)]
    price = info.find('span', {"class": "price-text-1HrJ_ text-text-1PdBw text-size-s-1PUdo"}).text
    title = info.find('h3', {"class": "title-root-395AQ iva-item-title-1Rmmj title-listRedesign-3RaU2 title-root_maxHeight-3obWc text-text-1PdBw text-size-s-1PUdo text-bold-3R9dt"}).text
    geo = info.find('span', {"class": "geo-address-9QndR text-text-1PdBw text-size-s-1PUdo"}).text
    date = info.find('div', {"data-marker": "item-date"}).text
    link = info.find('a', class_='link-link-39EVK link-design-default-2sPEv title-root-395AQ iva-item-title-1Rmmj title-listRedesign-3RaU2 title-root_maxHeight-3obWc').get('href')
    print(title,' ', price,geo,date,"https://www.avito.ru"+link)
    count += 1


This is the output and the error I get:
https://www.avito.ru/moskva/kvartiry/sdam/na_dlitelnyy_srok-ASgBAgICAkSSA8gQ8AeQUg?cd=1&s=104&user=1
empty
0
Traceback (most recent call last):
  File "C:/go/GO.py", line 30, in <module>
    print(houses[0])
IndexError: list index out of range

Process finished with exit code 1


What could be the problem, and how can I fix it?

2 answers
alexbprofit, 2021-12-25
@alexbprofit

The content on the site is generated with JavaScript, so a plain HTTP request does not return the listing blocks. Use Selenium for parsing.
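
A minimal sketch of that suggestion, assuming the `selenium` package and a local Chrome browser are installed. The helper name `fetch_rendered_html` is hypothetical and not part of the original code; it only shows how the rendered page source could be obtained in place of `response.text`.

```python
def fetch_rendered_html(url):
    """Open the page in a real browser and return the HTML after scripts have run."""
    # Imported inside the function so the module still loads
    # on a machine where Selenium is not installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: Chrome is available locally
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails


# Usage (opens a browser window, so it is left commented out here):
# html = fetch_rendered_html('https://www.avito.ru/moskva/kvartiry/sdam/na_dlitelnyy_srok-ASgBAgICAkSSA8gQ8AeQUg?cd=1&s=104&user=1')
# html_soup = BeautifulSoup(html, 'html.parser')
```

The returned string can be passed to `BeautifulSoup` the same way `response.text` is in the question. Content that loads late may need an explicit wait before `page_source` is read.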

soremix, 2021-12-25
@SoreMix

There are no elements in houses. You are trying to get the first element, but it does not exist: the output shows "empty" and a length of 0 right before the traceback.
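
The check described above can be sketched as follows. This is an illustration only: `first_or_none` is a hypothetical helper, and `houses` is set to an empty list here to mimic the situation in the question.

```python
def first_or_none(items):
    """Return the first element of items, or None when the list is empty."""
    return items[0] if items else None


houses = []  # what the scraper ends up with when no listing blocks are found

first = first_or_none(houses)
if first is None:
    print('No listings found, nothing to print')
else:
    print(first)

# Slicing never raises IndexError, so this loop is safe
# even when fewer than six items were collected.
for house in houses[:6]:
    print(house)
```

Guarding the index this way prevents the crash, but it does not explain why the list is empty; that still has to be fixed at the point where the page is fetched.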
