Python
Alexander Kovalenko, 2021-04-09 23:04:56

Why doesn't the Selenium parser get all the elements?

I need to collect the links to all the ads, but the parser only returns links for the first 2 ads, and each link appears twice. Why?

from selenium import webdriver
from selenium.webdriver.common.by import By

base_link = 'https://www.milanuncios.com/moda-mujer/?vendedor=part&orden=relevance&fromSearch='


class MilanunciosParser(object):
    def __init__(self, driver):
        self.driver = driver

    def parse(self):
        self.page()

    def page(self):
        self.driver.get(base_link)

        # Collect every element that carries the ad title link class
        main_div = self.driver.find_elements(By.CLASS_NAME, 'ma-AdCard-titleLink')

        for url in main_div:
            print(url.get_attribute('href'))


def main():
    driver = webdriver.Chrome()
    parser = MilanunciosParser(driver)
    parser.parse()


if __name__ == '__main__':
    main()


Output:
https://www.milanuncios.com/abrigos-y-chaquetas/ultimo-dia-de-la-gran-oferta-394525989.htm
https://www.milanuncios.com/abrigos-y-chaquetas/ultimo-dia-de-la-gran-oferta-394525989.htm
https://www.milanuncios.com/jerseys-mujer/jersey-lana-negro-aplicaciones-386796487.htm
https://www.milanuncios.com/jerseys-mujer/jersey-lana-negro-aplicaciones-386796487.htm

1 answer(s)
Alexa2007, 2021-04-09
@KovalenkoA12

This is not a parser error; it is how the page is built. Each ad really does contain two elements with the same class, so every link comes back twice. You will have to add a step that goes through the list and removes the duplicates.
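A minimal sketch of that duplicate-removal step, assuming the hrefs have already been read from the elements into a plain list. The helper name `unique_links` is hypothetical, and the sample list is copied from the output shown in the question.

```python
def unique_links(hrefs):
    """Return hrefs without duplicates or empty values, keeping first-seen order."""
    # A dict preserves insertion order (Python 3.7+), so dict.fromkeys()
    # drops repeats without reshuffling the ads.
    return list(dict.fromkeys(h for h in hrefs if h))


# Sample input: the four links printed in the question.
hrefs = [
    "https://www.milanuncios.com/abrigos-y-chaquetas/ultimo-dia-de-la-gran-oferta-394525989.htm",
    "https://www.milanuncios.com/abrigos-y-chaquetas/ultimo-dia-de-la-gran-oferta-394525989.htm",
    "https://www.milanuncios.com/jerseys-mujer/jersey-lana-negro-aplicaciones-386796487.htm",
    "https://www.milanuncios.com/jerseys-mujer/jersey-lana-negro-aplicaciones-386796487.htm",
]

for link in unique_links(hrefs):
    print(link)
```

In the parser itself, the list could be built with `[el.get_attribute('href') for el in main_div]` and passed to the helper before printing. A `set` would also remove the repeats, but it would not keep the ads in page order.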
